Wednesday, April 9, 2008

XML Encoding Problem In SQL Server

In a project i was sending an XML file to a SP which was written by someone. It was insering values to a master table and some details table. The SP was taking some TEXT datatype as input for the XML. Now the problem was one day we found some
"XML parsing error: An invalid character was found in text content. sql server xml parse" error and i come to know that some datatype is having some non-ascii characters like "Ö","«" for which the error was generated.
We were told to just avoid throughing error so that the user can continue their work. The fastest solution i came accross was to format the string before puttin into the XML.
So i wrote a function which will take the String as input and return the ASCII string as output as below.
public static string getUTF8String(string str) {
char[] ch = str.ToCharArray();
byte[] by = new byte[ch.Length * 2]; //If all char is UNicode then max byte is 2 X length System.Text.ASCIIEncoding enc = new ASCIIEncoding();
int charUsed, byteUsed;
bool completed;
enc.GetEncoder().Convert(ch, 0, ch.Length, by, 0, by.Length, true, out charUsed, out byteUsed, out completed);
string res=enc.GetString(by, 0, byteUsed);
return res;
}
So the Non-Ascii character may change, but the workflow wsa not hampered.
But the best solution is to change the SP and the xml sent to the SP. In the SP you should make the input parameter as NTEXT rather than TEXT and in the XML mention the encoding as UTF-16 - not UTF-8. In this way it will accept all unicode characters.

No comments: