I have a .doc / .docx file that contains tags, for example:

... [FirstName][MiddleName][LastName] ... [Date] 

I need to replace the tags with values. The values are passed to the function from JS.

I use C#.

Any suggestions?

  • Look for a library that works with .doc files. Frankly, the choice is small (in fact, there is only one on NuGet). To clarify: is this about .doc, not .docx? - Alexander Petrov
  • Either, maybe. Whichever makes this problem easier to solve. - endovitskiiy
  • What do you mean by "either"? Maybe .txt as well? - Alexander Petrov
  • No, .doc or .docx. - endovitskiiy
  • For .docx, the library situation is much better. Search for the keywords OpenXml, ClosedXml. Or, for example. - Alexander Petrov

3 answers

Here is an option that uses Open XML SDK 2.5 for .docx. The project needs references to the WindowsBase assembly and to the SDK itself. Microsoft Office does not have to be installed. The only problem is that a Paragraph element's InnerText cannot be changed directly, only through the Run elements that store the text. As a result, the replacement shown here loses the text formatting stored in RunProperties. The easiest way to understand the SDK's code and classes is through the file-to-code reflector at the link.

The tags are highlighted in yellow only for the demonstration; this is not required for the code to work.

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;
    using Paragraph = DocumentFormat.OpenXml.Wordprocessing.Paragraph;
    using Run = DocumentFormat.OpenXml.Wordprocessing.Run;
    using Text = DocumentFormat.OpenXml.Wordprocessing.Text;

    namespace TestC
    {
        class Program
        {
            static void Main(string[] args)
            {
                string initialPath = @"C:\Users\User\Documents\TestDocument.docx";
                string resultPath = @"C:\Users\User\Documents\TestDocument_result.docx";

                // Work on a copy so that the original document stays untouched.
                File.Copy(initialPath, resultPath, overwrite: true);

                // Tag name (without brackets) -> replacement value.
                Dictionary<string, string> marks = new Dictionary<string, string>()
                {
                    { "FirstName", "Иван" },
                    { "LastName", "Иванов" },
                    { "Date", DateTime.Now.ToLongDateString() },
                    { "Initials", "Иван И. И." },
                    { "DateIssued", DateTime.Now.AddDays(5).ToLongDateString() }
                };

                using (WordprocessingDocument document = WordprocessingDocument.Open(resultPath, true))
                {
                    Body documentBody = document.MainDocumentPart.Document.Body;

                    // Only paragraphs that contain at least one [Tag].
                    List<Paragraph> paragraphsWithMarks = documentBody.Descendants<Paragraph>()
                        .Where(x => Regex.IsMatch(x.InnerText, @"\[\w+\]"))
                        .ToList();

                    foreach (Paragraph paragraph in paragraphsWithMarks)
                    {
                        foreach (Match markMatch in Regex.Matches(paragraph.InnerText, @"\[\w+\]"))
                        {
                            string paragraphMarkValue = markMatch.Value.Trim(new[] { '[', ']' });
                            string markValueFromCollection;
                            // Tags without a value in the dictionary are left as they are.
                            if (marks.TryGetValue(paragraphMarkValue, out markValueFromCollection))
                            {
                                string editedParagraphText = paragraph.InnerText.Replace(markMatch.Value, markValueFromCollection);

                                // Rebuilding the paragraph as a single Run drops the per-run formatting (RunProperties).
                                paragraph.RemoveAllChildren<Run>();
                                paragraph.AppendChild<Run>(new Run(
                                    new Text(editedParagraphText) { Space = SpaceProcessingModeValues.Preserve }));
                            }
                        }
                    }
                }
            }
        }
    }
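The tag lookup itself does not depend on the SDK, so it can be isolated and checked on plain strings. Below is a minimal sketch, assuming tags always have the form [Name]. The names TagReplacer and ReplaceTags are hypothetical and not part of the Open XML SDK. It replaces all known tags in a single pass instead of rebuilding the text once per match:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the Open XML SDK.
public static class TagReplacer
{
    // Matches tags such as [FirstName]; group 1 captures the name without brackets.
    private static readonly Regex TagPattern = new Regex(@"\[(\w+)\]");

    // Replaces every known tag in one pass; tags without a value are left untouched.
    public static string ReplaceTags(string text, IReadOnlyDictionary<string, string> values)
    {
        return TagPattern.Replace(text, match =>
        {
            string value;
            return values.TryGetValue(match.Groups[1].Value, out value)
                ? value
                : match.Value;
        });
    }
}
```

For example, with values for FirstName and LastName only, the text "[FirstName] [MiddleName] [LastName]" keeps the [MiddleName] tag and has the other two replaced. The result can then be written back into the paragraph the same way as in the answer.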

Before:

Document before

After:

Document after

  • When replacing, is the formatting lost only for the replaced text? - endovitskiiy
  • @endovitskiiy For the whole paragraph that contains it. But some of the styles apply to the entire paragraph (the Heading 1 style, for example) and are specified in the ParagraphProperties of the Paragraph. Those are not lost. Things such as slant or italics, which may be specified in the RunProperties of a text run (Run) inside the Paragraph, are lost. It's not that simple there, of course... - Vladislav Khapin
  • I have no slant / italics in the document, no formatting at all. Only text, tables and indents. Are paragraph indents lost? - endovitskiiy
  • @endovitskiiy, I just tried it: they are not lost. Neither indents nor spacing. - Vladislav Khapin
  • Well... I would say OpenXML is probably a more correct approach than driving the Microsoft Word process through COM Interop. - Vladislav Khapin

Instead of tags, it is better to use a DocVariable field:

Add variable

Then, using Microsoft.Office.Interop.Word, you can write something like the following:

    using System;
    using Word = Microsoft.Office.Interop.Word;

    // ...

    Object missingObj = System.Reflection.Missing.Value;
    Object templatePathObj = @"e:\temp\template.docx";

    Word._Application application = new Word.Application();
    Word._Document sdoc = null;
    try
    {
        // Create a new document based on the template.
        sdoc = application.Documents.Add(ref templatePathObj, ref missingObj, ref missingObj, ref missingObj);
    }
    catch (Exception)
    {
        // Do not leave a hidden Word process running if the template cannot be opened.
        application.Quit(ref missingObj, ref missingObj, ref missingObj);
        application = null;
        throw; // rethrow while keeping the original stack trace
    }

    object variable2 = "The value for the variable goes here";
    if (sdoc.Variables.Count == 0)
    {
        sdoc.Variables.Add("SomeField", ref variable2);
    }
    else
    {
        sdoc.Variables["SomeField"].Value = variable2.ToString();
    }

    // Refresh the DocVariable fields so that they show the new value.
    sdoc.Fields.Update();
    application.Visible = true;

If Microsoft Office and the Word Interop assemblies are installed on the computer, you can use Word's own find-and-replace through Interop. This method only works if the replacement value is shorter than 256 characters and does not contain a newline. If the replacement values are long or contain line breaks, search for each occurrence of the tag and replace it individually. The tag's style is applied to the replacement string.

    using Word = Microsoft.Office.Interop.Word;
    ...
    Word.Application word = new Word.Application();
    word.Visible = true;
    // имяВходногоФайлаСМетками: path of the input file that contains the tags
    Word.Document wordDocument = word.Documents.Open(имяВходногоФайлаСМетками);
    ...
    var find = word.Selection.Find;
    find.Text = строкаМетки;                    // the tag to search for
    find.Replacement.Text = значениеДляЗамены;  // the value to put in its place
    find.Execute(
        FindText: Type.Missing,
        MatchCase: true,
        MatchWholeWord: true,
        MatchWildcards: false,
        MatchSoundsLike: Type.Missing,
        MatchAllWordForms: false,
        Forward: true,
        Wrap: Word.WdFindWrap.wdFindContinue,
        Format: false,
        ReplaceWith: Type.Missing,
        Replace: Word.WdReplace.wdReplaceAll);  // replace every occurrence in one call
    wordDocument.Save();
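The restriction described above can be checked before calling Word, to decide whether a single "replace all" call is enough or each occurrence has to be handled separately. This is a rough sketch that only encodes the limits stated in this answer (fewer than 256 characters, no line breaks). ReplacementCheck and FitsReplaceAll are hypothetical names, not part of the Interop API:

```csharp
using System;

// Hypothetical helper for illustration; not part of Microsoft.Office.Interop.Word.
public static class ReplacementCheck
{
    // Limit as stated in the answer: the value must be shorter than 256 characters.
    public const int MaxLengthExclusive = 256;

    // Returns true when the value is short enough and contains no line breaks.
    public static bool FitsReplaceAll(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        return value.Length < MaxLengthExclusive
            && value.IndexOf('\r') < 0
            && value.IndexOf('\n') < 0;
    }
}
```

When FitsReplaceAll returns false for a value, fall back to searching for each occurrence of the tag and replacing it individually, as the answer suggests.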