As noted earlier, you can use Microsoft's Open XML SDK, which can be downloaded here. After installing it, add references to the following assemblies in your project:
- DocumentFormat.OpenXml
- WindowsBase
To work with the document's text, import the following namespaces:

    using System; // Console, Exception
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;
For demonstration I wrote a small console application. The function below parses the text formatting in the file at the given path: it prints the text of each run in the document, followed by the formatting applied to that run:
    static void ReadDocx(string path)
    {
        try
        {
            // Open the document read-only (second argument: isEditable = false).
            using (var doc = WordprocessingDocument.Open(path, false))
            {
                foreach (var p in doc.MainDocumentPart.Document.Body.Elements<Paragraph>())
                {
                    foreach (var r in p.Elements<Run>())
                    {
                        Console.WriteLine(r.InnerText);

                        // A run without explicit formatting has no RunProperties element,
                        // so guard against null before reading its children.
                        var props = r.RunProperties;
                        if (props == null)
                            continue;

                        Console.WriteLine("Is:");
                        if (props.Bold != null) Console.WriteLine("Bold");
                        if (props.Italic != null) Console.WriteLine("Italic");
                        if (props.Underline != null) Console.WriteLine("Underlined");
                        if (props.Strike != null) Console.WriteLine("Strikethrough");

                        var vertical = props.VerticalTextAlignment;
                        if (vertical != null)
                        {
                            if (vertical.Val == VerticalPositionValues.Subscript)
                                Console.WriteLine("Subscript");
                            if (vertical.Val == VerticalPositionValues.Superscript)
                                Console.WriteLine("Superscript");
                        }
                    }
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
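The function above treats the mere presence of an element such as Bold as "formatting is on". A document may also contain a toggle that is explicitly switched off (for example `<w:b w:val="false"/>`), and a presence check would still report it. A minimal sketch of a stricter check follows. The class `RunFormatting` and the method `IsOn` are hypothetical names, not part of the SDK, and the sketch assumes that only direct run formatting matters (it ignores formatting inherited from styles):

```csharp
using DocumentFormat.OpenXml.Wordprocessing;

static class RunFormatting
{
    // Hypothetical helper (not part of the Open XML SDK).
    // Bold, Italic and Strike all derive from OnOffType.
    public static bool IsOn(OnOffType flag)
    {
        if (flag == null) return false;     // element absent: no direct formatting
        if (flag.Val == null) return true;  // <w:b/> without w:val means "on"
        return flag.Val.Value;              // explicit w:val="true" / "false"
    }
}
```

Inside the loop it could be used as `bool isBold = RunFormatting.IsOn(r.RunProperties?.Bold);`. Underline is not an on/off toggle (it carries an underline style value), so it would need a separate check.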
StreamWriter writes text as a plain sequence of characters; the concept of style simply does not apply there. - Bulson