I am extracting the text from document.xml inside a Word document. The result is a string full of tags. Here is a fragment of that string:

<w:p w14:paraId="6EA357F9" w14:textId="56718C92" w:rsidR="00EB1C12" w:rsidRPr="00D25518" w:rsidRDefault="004C2F3B" w:rsidP="004C2F3B"> <w:pPr> <w:pStyle w:val="a3"/> <w:ind w:firstLine="0"/> <w:jc w:val="center"/> <w:rPr> <w:rFonts w:cs="Times New Roman"/> <w:szCs w:val="28"/> <w:lang w:val="ru-RU"/> </w:rPr> </w:pPr> <w:r> <w:rPr> <w:rFonts w:cs="Times New Roman"/> <w:sz w:val="32"/> <w:szCs w:val="28"/> <w:lang w:val="ru-RU"/> </w:rPr> <w:t>АННОТАЦИЯ</w:t> </w:r> <w:bookmarkStart w:id="0" w:name="_GoBack"/> <w:bookmarkEnd w:id="0"/> </w:p> 

I want to replace, for example, </w:t> with \n using str_replace('</w:t>', "\n", $file); but nothing changes. I also want to split the string with the regular expression preg_match_all("#<w:t>(.*)</w:t>#", $file, $matches); into blocks roughly like the example above, but that returns nothing either: $matches is empty. Am I doing something wrong? Is there something simpler than regular expressions for this? Ideally I would get an array for the whole document.xml, divided into <w:p> ... </w:p> blocks. By the way, I also tried simplexml_load_string:

 $xml = simplexml_load_string($file); var_dump($xml->w:p); 

but that produces nothing either. Please advise.

  • var_dump($xml->w:p); - and PHP does not report any error on that? - PinkTux
  • Yes, it is an error; I wrote that loosely. I tried $xml->w and that is also an error. There is also: simplexml_load_string(): Entity: line 1: parser error : Start tag expected, '<' not found. Apparently it does not see <xml at the beginning. UPD: My mistake, I was doing it slightly wrong. $xml->w produces object(SimpleXMLElement)#3 (0) { } - Cuthbert_Allgood

1 answer

I tried it and it works. If it does not work for you, the problem is most likely on the PHP side:

 <?php
 // Sample paragraph taken from document.xml.
 $content = '<w:p w14:paraId="6EA357F9" w14:textId="56718C92" w:rsidR="00EB1C12" w:rsidRPr="00D25518" w:rsidRDefault="004C2F3B" w:rsidP="004C2F3B"> <w:pPr> <w:pStyle w:val="a3"/> <w:ind w:firstLine="0"/> <w:jc w:val="center"/> <w:rPr> <w:rFonts w:cs="Times New Roman"/> <w:szCs w:val="28"/> <w:lang w:val="ru-RU"/> </w:rPr> </w:pPr> <w:r> <w:rPr> <w:rFonts w:cs="Times New Roman"/> <w:sz w:val="32"/> <w:szCs w:val="28"/> <w:lang w:val="ru-RU"/> </w:rPr> <w:t>АННОТАЦИЯ</w:t> </w:r> <w:bookmarkStart w:id="0" w:name="_GoBack"/> <w:bookmarkEnd w:id="0"/> </w:p>';
 // Use double quotes for the newline: '\n' in single quotes is a literal backslash plus "n".
 // str_replace() returns the new string; $content itself is left unchanged.
 $replace = str_replace('</w:p>', "\n", $content);
 echo $replace;
 ?>
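As for getting an array of <w:p> blocks without fighting regular expressions, here is a minimal sketch using DOMDocument and DOMXPath. It assumes $xml holds the complete contents of document.xml, including the root element that declares the w namespace (the fragment in the question has no declaration, so a parser cannot resolve the w: prefix on its own). The helper names docxParagraphs and docxTextRuns and the W_NS constant are made up for this illustration. The second function is a regex variant of the pattern from the question: the quantifier is non-greedy, attributes on <w:t> are allowed, and the s and u flags let the pattern cross line breaks and treat the input as UTF-8.

```php
<?php
// WordprocessingML main namespace, the one bound to the "w" prefix in document.xml.
const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Hypothetical helper: returns one string per <w:p>, joined from its <w:t> runs.
function docxParagraphs(string $xml): array
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($xml)) {
        throw new RuntimeException('document.xml could not be parsed');
    }

    // XPath needs the prefix registered explicitly, otherwise //w:p finds nothing.
    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('w', W_NS);

    $paragraphs = [];
    foreach ($xpath->query('//w:p') as $p) {
        $text = '';
        // Collect every text run that is a descendant of this paragraph.
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }
        $paragraphs[] = $text;
    }
    return $paragraphs;
}

// Hypothetical regex variant: returns the raw contents of every <w:t> element.
// Entities such as &amp; are NOT decoded here, unlike in the DOM version.
function docxTextRuns(string $xml): array
{
    preg_match_all('#<w:t(?:\s[^>]*)?>(.*?)</w:t>#su', $xml, $matches);
    return $matches[1];
}

// Made-up miniature document.xml, used only to exercise the two helpers.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<w:document xmlns:w="' . W_NS . '"><w:body>'
    . '<w:p><w:r><w:t>АННОТАЦИЯ</w:t></w:r></w:p>'
    . '<w:p><w:r><w:t xml:space="preserve">Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>'
    . '</w:body></w:document>';

$paragraphs = docxParagraphs($xml);
$runs = docxTextRuns($xml);

echo implode("\n", $paragraphs), "\n";
```

With the miniature document above, $paragraphs holds two strings, one per paragraph, while $runs holds three, one per <w:t>. If loadXML fails with "Start tag expected, '<' not found", it is worth checking that the string passed in really begins with XML and is not, for example, a file name or an empty string.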