How can I open a .docx document and remove its very first line using OpenXML?

Or delete a specific piece of text? (I know the exact text of that line.)

  • You probably mean the first paragraph. As far as I know, OpenXML has no notion of a "line" of text: the length of a line depends on the width of the screen, whereas a paragraph is an actual element in the file (w:p). - nick_n_a

1 answer

If all that matters is the removal itself, here is code that does it.

First, import the namespaces needed to work with OpenXML:

    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

Next, open the file as a WordprocessingDocument with read and write access (the second argument, true, makes the document editable):

 WordprocessingDocument doc = WordprocessingDocument.Open(fileName, true); 

Then read the contents of the main document part (the XML that holds the document body) from its stream into a string:

    using (StreamReader reader = new StreamReader(doc.MainDocumentPart.GetStream()))
    {
        text = reader.ReadToEnd();
    }

All that remains is to replace (that is, delete) the unwanted text and write the result back to the stream. Here is the complete code:

    using System;
    using System.IO;
    using System.Text.RegularExpressions;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    namespace train
    {
        class Program
        {
            static void Main(string[] args)
            {
                string fileName = @"C:\HelloWorld.docx";
                string pattern = "Hello"; // the text to delete

                // "using" closes the package even if an exception is thrown.
                using (WordprocessingDocument doc = WordprocessingDocument.Open(fileName, true))
                {
                    string text;

                    // Read the raw XML of the main document part.
                    using (StreamReader reader = new StreamReader(doc.MainDocumentPart.GetStream()))
                    {
                        text = reader.ReadToEnd();
                    }

                    // Escape the pattern so it is matched literally.
                    // This runs over raw XML, so it also hits tag and attribute names.
                    Regex rg = new Regex(Regex.Escape(pattern));
                    text = rg.Replace(text, "");

                    // FileMode.Create truncates the part before the new XML is written.
                    using (StreamWriter writer = new StreamWriter(doc.MainDocumentPart.GetStream(FileMode.Create)))
                    {
                        writer.Write(text);
                    }
                }

                Console.WriteLine("Press any key...");
                Console.ReadKey();
            }
        }
    }
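If the "first line" is really the first paragraph, as the comment under the question suggests, the same removal can also be done on the document tree instead of the raw XML. The following is only a sketch under that assumption: the class name DocxParagraphs and both method names are made up for illustration, and only paragraphs that are direct children of the body are considered (paragraphs inside tables are skipped).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper class, not part of the OpenXML SDK.
static class DocxParagraphs
{
    // Removes the first w:p element directly under the body.
    // Returns true if a paragraph was found and removed.
    public static bool RemoveFirstParagraph(string fileName)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(fileName, true))
        {
            Body body = doc.MainDocumentPart.Document.Body;
            Paragraph first = body.Elements<Paragraph>().FirstOrDefault();
            if (first == null)
            {
                return false;
            }

            first.Remove();
            doc.MainDocumentPart.Document.Save();
            return true;
        }
    }

    // Removes every top-level paragraph whose text equals exactText.
    // Returns the number of paragraphs removed.
    public static int RemoveParagraphsWithText(string fileName, string exactText)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(fileName, true))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            // Materialize the matches first so the tree is not modified while it is enumerated.
            List<Paragraph> matches = body.Elements<Paragraph>()
                .Where(p => p.InnerText == exactText)
                .ToList();

            foreach (Paragraph p in matches)
            {
                p.Remove();
            }

            doc.MainDocumentPart.Document.Save();
            return matches.Count;
        }
    }
}
```

For example, DocxParagraphs.RemoveFirstParagraph(@"C:\HelloWorld.docx") would drop the first paragraph of the file used above. Comparing InnerText rather than searching the raw XML means the match should still work when Word has split the text across several runs, and tag or attribute names are never touched.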