FixWordAutoFormat (C#)
Text pasted from Word into an ASP.NET Web Form can cause problems when it is posted to a database: Word's auto-formatted characters (smart quotes, en/em dashes, bullets, ellipses) are replaced with '?' when the page content type is not windows-1252. This function works around the issue by replacing those characters with plain ASCII equivalents, which are safe in both ISO-8859-1 and UTF-8.
/// <summary>
/// Fixes text auto formatted by Word (em/en dashes, smart quotes, bullet, ellipses)
/// </summary>
/// <param name="input">String containing auto formatted text</param>
/// <returns>String without auto formatting</returns>
public static string FixWordAutoFormat(string input)
{
    // replace en-dash
    input = input.Replace("–", "-");
    // replace em-dash
    input = input.Replace("—", "-");
    // replace open single quote
    input = input.Replace("‘", "'");
    // replace close single quote
    input = input.Replace("’", "'");
    // replace open double quote
    input = input.Replace("“", "\"");
    // replace close double quote
    input = input.Replace("”", "\"");
    // replace bullets
    input = input.Replace("•", "*");
    // replace ellipses
    input = input.Replace("…", "...");
    return input;
}
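To see the effect, here is a minimal, self-contained sketch that round-trips a sample string through ISO-8859-1 before and after cleaning. It is an illustration, not part of the original function: the names WordTextDemo, FixWordAutoFormatEscaped and RoundTripLatin1 are hypothetical, and the '?' substitution is forced with an explicit replacement fallback rather than relying on whatever a given database driver or page encoding does. The variant below performs the same replacements as FixWordAutoFormat but spells the Word characters as \u escapes, on the assumption that this keeps the source file itself from depending on how it was saved.

```csharp
using System;
using System.Text;

public static class WordTextDemo
{
    // Hypothetical variant of FixWordAutoFormat: same replacements,
    // written with \u escapes so the .cs file can be saved in any encoding.
    public static string FixWordAutoFormatEscaped(string input)
    {
        return input
            .Replace("\u2013", "-")     // en dash
            .Replace("\u2014", "-")     // em dash
            .Replace("\u2018", "'")     // open single quote
            .Replace("\u2019", "'")     // close single quote
            .Replace("\u201C", "\"")    // open double quote
            .Replace("\u201D", "\"")    // close double quote
            .Replace("\u2022", "*")     // bullet
            .Replace("\u2026", "...");  // ellipsis
    }

    // Encodes to ISO-8859-1 and back. Characters outside that character
    // set come back as '?' because of the explicit replacement fallback.
    public static string RoundTripLatin1(string text)
    {
        Encoding latin1 = Encoding.GetEncoding(
            "ISO-8859-1",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return latin1.GetString(latin1.GetBytes(text));
    }

    public static void Main()
    {
        // Sample text as Word might produce it (smart quotes, en dash, ellipsis).
        string pasted = "\u201CIt\u2019s done\u201D \u2013 ok\u2026";

        // Without cleaning, every Word character is lost.
        Console.WriteLine(RoundTripLatin1(pasted));

        // After cleaning, the text is pure ASCII and survives intact.
        Console.WriteLine(RoundTripLatin1(FixWordAutoFormatEscaped(pasted)));
    }
}
```

Under these assumptions the first line printed is ?It?s done? ? ok? and the second is "It's done" - ok..., which is the behaviour the function is meant to restore.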
Tags: Web Developer Blog, CSharp, Word AutoFormat