Find and Replace in Word using C# .NET

Heads up, this article contains high quantity of geek content. Non-geeks should move along.

I’ve been trying to use Microsoft.Office.Interop.Word to perform a global bulk search and replace operations across an entire document. The problem was, however, if a document contained a floating text box, which manifested itself as a shape object of type textbox, the find and replace wouldn’t substitute the text for that region. Even using Word’s capability to record a macro and show the VBA code wasn’t helpful, as the source code in BASIC wasn’t performing the same operation as inside the Word environment.

What I wanted was a simple routine to replace text anywhere inside of a document. If you Google for this you’ll get the wrong kind of textbox, the wrong language, people telling you not to use floating textboxes, and all kinds of weird story iterators.

One site seemed to have the solution; many kind thanks to Doug Robbins, Greg Maxey, Peter Hewett, and Jonathan West for coming up with this solution and explaining it so well.

However, the solution was in Visual Basic for Applications, and I needed a C# solution for a .NET project. Here’s my port, which works with Office 2010 and Visual Studio 2010 C#/.NET 4.0. I’ve left a lot of redundant qualifiers and casting on to help people searching for this article.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using Microsoft.Office.Interop.Word;

// BEGIN: Somewhere in your code
Application app = null;
Document doc = null;
try
{
  app = new Microsoft.Office.Interop.Word.Application();

  doc = app.Documents.Open(filename, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing);

  FindReplaceAnywhere(app, find_text, replace_text);

  doc.SaveAs(outfilename, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing, Missing);
}
finally
{
  try
  {
      if (doc != null) ((Microsoft.Office.Interop.Word._Document) doc).Close(true, Missing, Missing);
  }
  finally { }
  if (app != null) ((Microsoft.Office.Interop.Word._Application) app).Quit(true, Missing, Missing);
}
// END: Somewhere in your code             



// Helper
private static void searchAndReplaceInStory(Microsoft.Office.Interop.Word.Range rngStory, string strSearch, string strReplace)
{
    rngStory.Find.ClearFormatting();
    rngStory.Find.Replacement.ClearFormatting();
    rngStory.Find.Text = strSearch;
    rngStory.Find.Replacement.Text = strReplace;
    rngStory.Find.Wrap = WdFindWrap.wdFindContinue;

    object arg1 = Missing; // Find Pattern
    object arg2 = Missing; //MatchCase
    object arg3 = Missing; //MatchWholeWord
    object arg4 = Missing; //MatchWildcards
    object arg5 = Missing; //MatchSoundsLike
    object arg6 = Missing; //MatchAllWordForms
    object arg7 = Missing; //Forward
    object arg8 = Missing; //Wrap
    object arg9 = Missing; //Format
    object arg10 = Missing; //ReplaceWith
    object arg11 = WdReplace.wdReplaceAll; //Replace
    object arg12 = Missing; //MatchKashida
    object arg13 = Missing; //MatchDiacritics
    object arg14 = Missing; //MatchAlefHamza
    object arg15 = Missing; //MatchControl

    rngStory.Find.Execute(ref arg1, ref arg2, ref arg3, ref arg4, ref arg5, ref arg6, ref arg7, ref arg8, ref arg9, ref arg10, ref arg11, ref arg12, ref arg13, ref arg14, ref arg15);
}

// Main routine to find text and replace it,
//   var app = new Microsoft.Office.Interop.Word.Application();
public static void FindReplaceAnywhere(Microsoft.Office.Interop.Word.Application app, string findText, string replaceText)
{
    // http://forums.asp.net/p/1501791/3739871.aspx
    var doc = app.ActiveDocument;

    // Fix the skipped blank Header/Footer problem
    //    http://msdn.microsoft.com/en-us/library/aa211923(office.11).aspx
    Microsoft.Office.Interop.Word.WdStoryType lngJunk = doc.Sections[1].Headers[WdHeaderFooterIndex.wdHeaderFooterPrimary].Range.StoryType;

    // Iterate through all story types in the current document
    foreach (Microsoft.Office.Interop.Word.Range rngStory in doc.StoryRanges)
    {

        // Iterate through all linked stories
        var internalRangeStory = rngStory;

        do
        {
            searchAndReplaceInStory(internalRangeStory, findText, replaceText);

            try
            {
                //   6 , 7 , 8 , 9 , 10 , 11 -- http://msdn.microsoft.com/en-us/library/aa211923(office.11).aspx
                switch (internalRangeStory.StoryType)
                {
                    case Microsoft.Office.Interop.Word.WdStoryType.wdEvenPagesHeaderStory: // 6
                    case Microsoft.Office.Interop.Word.WdStoryType.wdPrimaryHeaderStory:   // 7
                    case Microsoft.Office.Interop.Word.WdStoryType.wdEvenPagesFooterStory: // 8
                    case Microsoft.Office.Interop.Word.WdStoryType.wdPrimaryFooterStory:   // 9
                    case Microsoft.Office.Interop.Word.WdStoryType.wdFirstPageHeaderStory: // 10
                    case Microsoft.Office.Interop.Word.WdStoryType.wdFirstPageFooterStory: // 11

                        if (internalRangeStory.ShapeRange.Count > 0)
                        {
                            foreach (Microsoft.Office.Interop.Word.Shape oShp in internalRangeStory.ShapeRange)
                            {
                                if (oShp.TextFrame.HasText != 0)
                                {
                                    searchAndReplaceInStory(oShp.TextFrame.TextRange, findText, replaceText);
                                }
                            }
                        }
                        break;

                    default:
                        break;
                }
            }
            catch
            {
                // On Error Resume Next
            }

            // ON ERROR GOTO 0 -- http://www.harding.edu/fmccown/vbnet_csharp_comparison.html

            // Get next linked story (if any)
            internalRangeStory = internalRangeStory.NextStoryRange;
        } while (internalRangeStory != null); // http://www.harding.edu/fmccown/vbnet_csharp_comparison.html
    }

}

Let me know if it worked for you; bug fixes and enhancements welcome.

This entry was posted in C#, Geek, Hack, How To, Microsoft, Software, Trick, Windows, Workaround and tagged , , , , , , , , , , , , , , , , , , , , , , , , , , , . Bookmark the permalink.

2 Responses to Find and Replace in Word using C# .NET

  1. Tamas Somogyi says:

    Thanks for this useful article, it works indeed: replaces the texts in boxes, headers, main text body, etc.
    Just two remarks:
    1) I guess the try-catch block is needed for workaround MS bug in Range.ShapeRange call as it throws exception instead of returning null/0.
    2) If your document doesn’t contain text boxes, try iterating through sections as it is faster than stories:
    foreach (Section s in doc.Sections) {
    foreach (HeaderFooter h in sct.Headers)
    searchAndReplaceInStory(h.Range, findText, replaceText);
    foreach (HeaderFooter f in sct.Footers)
    searchAndReplaceInStory(f.Range, findText, replaceText);
    searchAndReplaceInStory(s.Range, findText, replaceText);
    }
    For my documents (15 pages legal texts), replacing 50 texts takes 23 seconds using Stories, but only 12 second using Ranges. If you know an even faster solution, please post it.

  2. Stijn Bollen says:

    Thanks for this useful article.

    One general rule when working with office interop:
    Never use more then 2 dots in a statement.
    Declare temporary variables and dispose of them after u are done using them.

    Try to avoid creating multiple objects with the same content.
    e.g. Instead of declaring 14 arguments with the value Missing,
    you could just create 1 object:
    object oMissing = System.Reflection.Missing.Value;

    Instead of using
    Microsoft.Office.Interop.Word...
    over and over again, you could use
    using Word = Microsoft.Office.Interop.Word;
    in your imports. By doing this, you can skip “Microsoft.Office.Interop.” part in the rest of your code.

    To make your code more readable I would switch code and comment in the following section:
    case Microsoft.Office.Interop.Word.WdStoryType.wdEvenPagesHeaderStory: // 6
    case 6: // Word.WdStoryType.wdEvenPagesHeaderStory

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>