Reading Writing Word Doc
Written by Zack Mills
Thursday, 22 October 2009 22:16

http://asptutorials.net/C-SHARP/accessing-ms-word-documents-from-dot-net/

 

Accessing Microsoft Word Documents Programmatically from C# .Net 2

This article describes how to open a Word document from .NET code and read from or write to the .doc file. It includes the C# snippets needed to do this.

Configuring

In Solution Explorer, choose Add Reference..., then add a reference to the Microsoft Word 11.0 Object Library (if you have MS Office 2003).

To access a Word document, we need to follow these steps:

  1. Create an instance of Word.ApplicationClass
  2. Use the ApplicationClass instance to open a Word.Document
  3. Process the document, either reading from it or writing to it
  4. Close the Word.Document (optionally saving changes)
  5. Quit the ApplicationClass instance
  6. Release the leftover COM object. (If you don't do this, you will find lots of WINWORD.EXE instances in your task manager, because the .NET wrapper keeps its reference to Word until it is garbage-collected.)
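
The six steps above can be sketched as a single method. This is a minimal sketch rather than the article's exact code: the WordSession class and ReadAllText method are hypothetical names, and the using alias assumes the Office primary interop assembly namespace Microsoft.Office.Interop.Word. The try/finally is an addition here so that steps 4 to 6 still run when step 2 or 3 throws.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;
// Assumption: the Office PIA namespace. Drop this alias if your reference
// already exposes a top-level Word namespace.
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class, for illustration only.
static class WordSession
{
    public static string ReadAllText(string path)
    {
        object missing = Missing.Value;   // stands in for unused parameters
        object filePath = path;
        Word.ApplicationClass app = null;
        Word.Document doc = null;
        try
        {
            app = new Word.ApplicationClass();                         // step 1
            doc = app.Documents.Open(ref filePath,                     // step 2
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing, ref missing);
            return doc.Content.Text;                                   // step 3
        }
        finally
        {
            if (doc != null)
            {
                doc.Close(ref missing, ref missing, ref missing);      // step 4
            }
            if (app != null)
            {
                app.Quit(ref missing, ref missing, ref missing);       // step 5
                Marshal.ReleaseComObject(app);                         // step 6
            }
        }
    }
}
```

A caller would then write something like string text = WordSession.ReadAllText(@"C:\my\filename.doc"); and never touch the Word objects directly.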

Each of these steps is described below:

Creating a Word Application

This code will create the word application:

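// Initialise to null so the variable is definitely assigned after the try block
Word.ApplicationClass wordApplication = null;
try
{
    // Starts a new Word process through COM
    wordApplication = new Word.ApplicationClass();
}
catch (Exception e)
{
    MessageBox.Show("ERROR! Do you have MS Word installed? " + e.Message);
    return; // assumes a void method; nothing more can be done without Word
}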

Opening the Word Document

The function that opens Word documents takes a lot of parameters (16 in the Word 11.0 library), most of them so obscure that you will never need them. However, every parameter is passed by reference as an object, so you must supply a value for each one. Create one object to represent the unused parameters (I've called mine o_null), and create an object for each parameter that you do use: I create o_true and o_false for parameters that take a boolean, and o_filePath for the file name, before calling the Open function.

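object o_null = System.Reflection.Missing.Value;   // stands in for every unused parameter

object o_true = true;    // boxed booleans for parameters that take a boolean
object o_false = false;
object o_filePath = @"C:\my\filename.doc";
Word.Document doc = null;   // null so the variable is assigned even if Open throws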
try
{
   doc = wordApplication.Documents.Open(ref o_filePath,
   ref o_null, ref o_null, ref o_null, ref o_null, ref o_null,
   ref o_null, ref o_null, ref o_null, ref o_null, ref o_null,
   ref o_null, ref o_null, ref o_null, ref o_null, ref o_null);
}
catch (Exception e)
{
	MessageBox.Show("ERROR! Couldn't open that file. "+ e.Message.ToString());
}
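
The Open call above passes o_null for everything except the file name. As a hedged illustration of where boxed booleans come in, the sketch below opens a document read-only. The WordOpen class and OpenReadOnly method are hypothetical names, the using alias assumes the Office primary interop assembly namespace, and the parameter positions follow the 16-parameter Open signature of the Word 11.0 library.

```csharp
using System;
using System.Reflection;
// Assumption: the Office PIA namespace. Drop this alias if your reference
// already exposes a top-level Word namespace.
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class, for illustration only.
static class WordOpen
{
    public static Word.Document OpenReadOnly(Word.ApplicationClass app, string path)
    {
        object missing = Missing.Value;
        object filePath = path;
        object o_true = true;     // boxed so it can be passed by ref as object
        object o_false = false;

        return app.Documents.Open(ref filePath,
            ref o_false,   // ConfirmConversions: no conversion prompt
            ref o_true,    // ReadOnly: open without write access
            ref o_false,   // AddToRecentFiles: keep the recent-files list clean
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing);
    }
}
```

Opening read-only is a reasonable default when you only want to extract text, since it avoids locking the file for other users.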

Do Something With The Content

If you want to read text from the file, you can use:

string alltext = doc.Content.Text;

which returns the entire document contents as a string. The string contains no formatting except for paragraph breaks, which appear as carriage-return characters (character code 13, '\r' in C#).
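
Because each paragraph ends with character 13, the string can be split into paragraphs without touching Word again. The sketch below is plain string handling; the WordText class and SplitParagraphs method are hypothetical names, and it assumes the text ends with a final paragraph mark, which is trimmed so that no empty trailing entry is produced.

```csharp
using System;

// Hypothetical helper class, for illustration only.
static class WordText
{
    // Splits text such as doc.Content.Text into one string per paragraph.
    public static string[] SplitParagraphs(string allText)
    {
        // Trim trailing paragraph marks first so the last entry is not empty.
        return allText.TrimEnd('\r').Split('\r');
    }
}
```

For example, string[] paragraphs = WordText.SplitParagraphs(alltext); gives you an array you can loop over with foreach.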

Using ranges and paragraphs

Accessing Form fields and checkboxes

Unprotecting the document and protecting it again

Inserting text into it

Making changes using Find / Replace

Saving the document

Saving the document requires another cumbersome function with far too many parameters. In this example I use wdFormatHTML to convert the file to HTML format, but this isn't necessary:

object o_filename = @"C:\a\nice\filename.html";   // extension matches the HTML format chosen below
object o_format = Word.WdSaveFormat.wdFormatHTML;
object o_encoding = Microsoft.Office.Core.MsoEncoding.msoEncodingUTF8;
object o_endings = Word.WdLineEndingType.wdCRLF;
try
{
   wordApplication.ActiveDocument.SaveAs(ref o_filename, ref o_format, ref o_null,
   ref o_null, ref o_null, ref o_null, ref o_null, ref o_null, ref o_null,
   ref o_null, ref o_null, ref o_encoding, ref o_null,
   ref o_null, ref o_endings, ref o_null);
}
catch (Exception e)
{
	MessageBox.Show("ERROR! Couldn't save that file. "+ e.Message.ToString());
}

Closing the Word Instance

You must remember to quit the Word application and to discard the COM object afterwards:

doc.Close(ref o_null, ref o_null, ref o_null);
wordApplication.Quit(ref o_null, ref o_null, ref o_null);
System.Runtime.InteropServices.Marshal.ReleaseComObject(wordApplication);

First, the .Close command closes the document. Second, the .Quit command quits the Word application. However, you are still left with a rogue WINWORD.EXE process in task manager, hogging resources, because the .NET wrapper around the COM object keeps its reference to Word until it is garbage-collected. Calling .ReleaseComObject releases that reference so the process can exit.
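
If WINWORD.EXE still lingers, one thing worth trying is to release the document object as well as the application. The sketch below wraps the release in a small guard so it can be called safely on null or on objects that are not COM wrappers. The ComCleanup class and Release method are hypothetical names; Marshal.IsComObject and Marshal.ReleaseComObject are the standard System.Runtime.InteropServices calls.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper class, for illustration only.
static class ComCleanup
{
    // Releases a COM wrapper if there is one. Returns true when a release happened.
    public static bool Release(object comObject)
    {
        // Check for null first: Marshal.IsComObject throws on a null argument.
        if (comObject == null || !Marshal.IsComObject(comObject))
        {
            return false;
        }
        Marshal.ReleaseComObject(comObject);
        return true;
    }
}
```

After the Close and Quit calls you would then write ComCleanup.Release(doc); followed by ComCleanup.Release(wordApplication);, and not use either variable again.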

So, that's it. Opening Word documents from .NET and working with their contents is very powerful and very useful, but a bit tricky.

 

Last Updated on Thursday, 22 October 2009 22:28