FixWordAutoFormat (C#)
Text pasted from Word into an ASP.NET Web Form can cause problems when it is posted to a database: Word's auto-formatted characters (smart quotes, dashes, bullets, ellipses) are replaced with '?' when the page content-type is not windows-1252. This function works around the issue by replacing those characters with plain ASCII equivalents, which are safe in ISO-8859-1 and UTF-8 alike.
/// <summary>
/// Fixes text auto formatted by Word (em/en dashes, smart quotes, bullet, ellipses)
/// </summary>
/// <param name="input">String containing auto formatted text</param>
/// <returns>String without auto formatting</returns>
public static string FixWordAutoFormat(string input)
{
    // nothing to fix; also avoids a NullReferenceException on null input
    if (string.IsNullOrEmpty(input))
    {
        return input;
    }
    // replace en-dash (U+2013)
    input = input.Replace("–", "-");
    // replace em-dash (U+2014)
    input = input.Replace("—", "-");
    // replace open single quote (U+2018)
    input = input.Replace("‘", "'");
    // replace close single quote (U+2019)
    input = input.Replace("’", "'");
    // replace open double quote (U+201C)
    input = input.Replace("“", "\"");
    // replace close double quote (U+201D)
    input = input.Replace("”", "\"");
    // replace bullets (U+2022)
    input = input.Replace("•", "*");
    // replace ellipses (U+2026)
    input = input.Replace("…", "...");
    return input;
}
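To see the effect, here is a minimal sketch that simulates the lossy step and compares the result with and without the fix. The class name WordAutoFormatDemo and the helpers Fix and RoundTripLatin1 are hypothetical names made up for this illustration. Fix is a table-driven variant of the function above, not the original code, and the '?' fallback is configured explicitly here as an assumption about how the real page or database conversion behaves.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class WordAutoFormatDemo
{
    // Same substitutions as FixWordAutoFormat, written as Unicode escapes so
    // the encoding of the source file itself cannot corrupt the literals.
    private static readonly Dictionary<char, string> Replacements =
        new Dictionary<char, string>
        {
            { '\u2013', "-" },   // en dash
            { '\u2014', "-" },   // em dash
            { '\u2018', "'" },   // open single quote
            { '\u2019', "'" },   // close single quote
            { '\u201C', "\"" },  // open double quote
            { '\u201D', "\"" },  // close double quote
            { '\u2022', "*" },   // bullet
            { '\u2026', "..." }, // ellipsis
        };

    // Hypothetical table-driven equivalent of FixWordAutoFormat (single pass).
    public static string Fix(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            string replacement;
            if (Replacements.TryGetValue(c, out replacement))
            {
                sb.Append(replacement);
            }
            else
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Assumption: the lossy step behaves like an ISO-8859-1 conversion that
    // substitutes '?' for every character it cannot represent.
    public static string RoundTripLatin1(string text)
    {
        Encoding latin1 = Encoding.GetEncoding(
            "iso-8859-1",
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return latin1.GetString(latin1.GetBytes(text));
    }

    public static void Main()
    {
        // Sample text as Word would auto-format it.
        string pasted = "\u201CWait\u2026\u201D \u2013 it\u2019s here";

        Console.WriteLine(RoundTripLatin1(pasted));      // ?Wait?? ? it?s here
        Console.WriteLine(RoundTripLatin1(Fix(pasted))); // "Wait..." - it's here
    }
}
```

The dictionary keeps all the substitutions in one place, so supporting another character means adding one entry instead of another Replace call.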
Tags: Web Developer Blog, CSharp, Word AutoFormat