Good day to all!

I'm stuck on XML with Russian (Cyrillic) names. Here is a version with English names:

 <messages>
   <parametr>
     <catalogue id="10" catalogname="Каталог 1">
       <name tovname="Товар 1"/>
       <text tovtext="текст о товаре 1"/>
       <price internet="95" rozn="100" opt="90"/>
     </catalogue>
   </parametr>
 </messages>

Next, an example handler:

 foreach ($xml->parametr as $parametr) {
     foreach ($parametr->catalogue as $catalogue) {
         // Attributes are read with array syntax; the keys must be quoted strings.
         print ($catalogue['catalogname']);
         print ($catalogue['id']);
         print ($catalogue->name['tovname']);
         print ($catalogue->price['internet']);
         print ($catalogue->text['tovtext']);
     }
 }

Everything works with the English version, but the real XML arrives with Cyrillic names. I want to know what the equivalent access would look like, for example:

 print (${"каталог"}->{"продукт"}[наименование]); 

when the XML looks like this:

 <каталог идентификатор="10" ... <продукт наименование="Товар 1"/> 

That is, how do I read the data from the XML in this variant?

  • Just access it that way. What exactly does not work? - user6550
  • PHP runs without errors, of course, but for some reason the values come back empty. - I_CaR

4 answers

The XML declaration must specify the encoding; in our case that is UTF-8. Wikipedia explains why this is necessary.


 $xmlcode = <<<XML
 <?xml version="1.0" encoding="UTF-8"?>
 <сообщения>
   <параметры>
     <каталог идентификатор="10" название="Каталог 1">
       <название>Товар 1</название>
       <текст>текст о товаре 1</текст>
       <цена интернет="95" розница="100" опт="90" />
     </каталог>
   </параметры>
 </сообщения>
 XML;
 $s = new SimpleXMLElement($xmlcode);
 var_dump($s);
 print "{$s->параметры->каталог->название}";


 object(SimpleXMLElement)#1 (1) {
   ["параметры"]=>
   object(SimpleXMLElement)#2 (1) {
     ["каталог"]=>
     object(SimpleXMLElement)#3 (4) {
       ["@attributes"]=>
       array(2) {
         ["идентификатор"]=>
         string(2) "10"
         ["название"]=>
         string(16) "Каталог 1"
       }
       ["название"]=>
       string(12) "Товар 1"
       ["текст"]=>
       string(28) "текст о товаре 1"
       ["цена"]=>
       object(SimpleXMLElement)#4 (1) {
         ["@attributes"]=>
         array(3) {
           ["интернет"]=>
           string(2) "95"
           ["розница"]=>
           string(3) "100"
           ["опт"]=>
           string(2) "90"
         }
       }
     }
   }
 }
 Товар 1
  • And would using the Windows-1251 encoding change anything for the better (more optimal code)? Or something like Utf2Win and Win2Utf? - I_CaR
  • Windows-1251 should be abandoned in favor of Unicode; that way you are less likely to step on a rake. In fact, if you have the option, avoid Cyrillic names for the document's elements in such XML files, even though the XML specification allows them. - nolka
  • Could you explain this part in more detail: object(SimpleXMLElement)#1 (1) { ["параметры"]=> ... ? Sorry, this is my first time with XML. Somehow I never needed it before. - I_CaR
  • I parsed the XML with the SimpleXML class, which is part of PHP. When it loads valid XML, it builds an object structure that mirrors the XML markup and returns it in a variable. So, to get the value of the "название" attribute of the "каталог" element, you access it as an element of an associative array, i.e. {$сообщения->параметры->каталог['название']}. In general, XML elements become object properties, and node attributes become elements of an associative array attached to that property. Sorry for the confusion. - nolka
  • I'll have to try it, but it looks daunting: string(28) ... After all, the XML will contain several thousand items. - I_CaR
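The element-versus-attribute access described in the comments above can be sketched as follows. This is a minimal example, assuming the UTF-8 document from the answer, with a second, made-up `каталог` element added only to show iteration; the names `$xml` and `$rows` are illustrative and not from the original thread.

```php
<?php
// Sample document: Cyrillic element and attribute names, declared as UTF-8.
// The second <каталог> is invented for illustration.
$xmlcode = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<сообщения>
  <параметры>
    <каталог идентификатор="10" название="Каталог 1">
      <название>Товар 1</название>
      <цена интернет="95" розница="100" опт="90"/>
    </каталог>
    <каталог идентификатор="11" название="Каталог 2">
      <название>Товар 2</название>
      <цена интернет="45" розница="50" опт="40"/>
    </каталог>
  </параметры>
</сообщения>
XML;

$xml = new SimpleXMLElement($xmlcode);

$rows = [];
foreach ($xml->параметры->каталог as $каталог) {
    $rows[] = [
        // Attributes: array syntax with a quoted key.
        'id'      => (string) $каталог['идентификатор'],
        'catalog' => (string) $каталог['название'],
        // Child elements: property syntax.
        'product' => (string) $каталог->название,
        'price'   => (string) $каталог->цена['интернет'],
    ];
}

print_r($rows);
```

Casting to `(string)` turns each SimpleXMLElement into a plain value, so several thousand items end up as ordinary arrays rather than the nested objects that var_dump prints.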

messages.xml:

 <?xml version="1.0" encoding="UTF-8"?>
 <сообщения>
   <параметры>
     <каталог идентификатор="10" название="Каталог 1">
       <название>Товар 1</название>
       <текст>текст о товаре 1</текст>
       <цена интернет="95" розница="100" опт="90"/>
     </каталог>
   </параметры>
 </сообщения>

messages.php:

 $parser = xml_parser_create();
 // Keep element names as written instead of upper-casing them.
 xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
 if (($h = fopen('messages.xml', 'r'))) {
     $data = fread($h, 100000000); // hee-hee (an arbitrarily large read limit)
     fclose($h);
     xml_parse($parser, $data, true);
     var_dump($data); // dumps the raw file contents, not a parse result
     unset($data);
 }

Output:

 string(422) "<?xml version="1.0" encoding="UTF-8"?>
 <сообщения>
   <параметры>
     <каталог идентификатор="10" название="Каталог 1">
       <название>Товар 1</название>
       <текст>текст о товаре 1</текст>
       <цена интернет="95" розница="100" опт="90"/>
     </каталог>
   </параметры>
 </сообщения>
 "
  • Wow! Like that? And what if the XML holds about 5k products, with 20-30 parameters each? Will I end up with one string variable like this? And why did it come out with line breaks? And do I then have to write a second handler to pull the data out of this variable? - I_CaR
  • Sorry, but those questions are for a different counter. I only showed that encodings and strings work fine in XML; as for where exactly it fails for you, the telepaths are on vacation. Just in case, let me clarify: this is an example, a convention, an abstraction. As for what var_dump() is and how to work with the result of xml_parse(), read the tutorials; no second handler is needed. - user6550
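For readers wondering how data actually comes out of xml_parse(): the parser reports elements through callbacks rather than returning a structure. Below is a minimal sketch, assuming an abbreviated inline version of messages.xml; the `$found` array and the decision to keep only elements that carry attributes are illustrative choices, not part of the answer above.

```php
<?php
// Abbreviated sample based on messages.xml, inlined so the sketch is self-contained.
$data = '<?xml version="1.0" encoding="UTF-8"?>'
      . '<сообщения><параметры>'
      . '<каталог идентификатор="10" название="Каталог 1">'
      . '<цена интернет="95" розница="100" опт="90"/>'
      . '</каталог></параметры></сообщения>';

$found = []; // attributes collected per element name (hypothetical structure)

$parser = xml_parser_create('UTF-8');
// Without this option the parser upper-cases element and attribute names.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Called for every opening tag with its name and attributes.
    function ($parser, string $name, array $attrs) use (&$found) {
        if ($attrs !== []) {
            $found[$name] = $attrs;
        }
    },
    // Called for every closing tag; nothing to do in this sketch.
    function ($parser, string $name) {
    }
);

if (xml_parse($parser, $data, true) !== 1) {
    throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
}

print_r($found);
```

Checking the return value of xml_parse() matters here: a malformed document would otherwise fail silently and leave `$found` empty.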

Awful. This is exactly why I never liked Bitrix and 1C: all their XML is in Russian. It is extremely inconvenient, and it is done only here. It has always been easier to write everything in English, which simplifies life for other developers, including foreign ones, if they ever need some data from you.

It is always easier to translate a few dozen parameters and names than to explain why something is written in Russian. IMHO.

 <messages>
   <parametr>
     <catalogue id="10" catalogname="Каталог 1">
       <name tovname="Товар 1"/>
       <text tovtext="текст о товаре 1"/>
       <price internet="95" rozn="100" opt="90"/>
     </catalogue>
   </parametr>
 </messages>

This is normal XML: readable and convenient. For reference: if you find yourself somewhere in France at a computer with no Russian keyboard layout or language installed, good luck editing the Cyrillic XML.

Yes, you can install Russian and so on, but editing will still be a hassle. So Russian here is yuck ... just like building sites in win1251 instead of utf8.

  • Shrek, I can see and understand myself that this is yuck ... In the old days even application software interfaces were written in Latin script. But you nearly guessed it: it is a 1C + NetCat bundle (all developed by our own coders). So 1C exports XML entirely in Cyrillic, and in win1251 on top of that! So I suffer, and when I was given the task of optimizing the import, the hint was: "but the people at NetCat (Moscow) managed to write it". As for the XML you criticized, I made it for myself to study how the transfer works. - I_CaR
  • I would also hint to whoever set the task that it would be easier to buy a finished product from them than to build your own rake, since he is so smart. - Artem
  • Buy? The question is whether it is worth paying a programmer that much money, and whether it is needed at all. An ultimatum, in a word. The point is that NetCat's PHP module works, but supposedly the purchased one does not suit them, and they want to test my qualifications at the same time: to use my optimized, economy-class version with minimal checks and deliberately without any of NetCat's frills. In the style of: took the data from the XML, ran an UPDATE in the database. Well, they do have the right, after all. I proposed importing into a local database and then synchronizing the local database with the site's database. That was postponed. - I_CaR
  • Sometimes it is easier to buy something ready-made than to write your own. I sometimes suggest this to my superiors, because sometimes it really is easier to buy. - Artem

In the declaration of the xml document, you must specify the encoding, in our case, this is utf-8

That's it! Problem solved! Not only by specifying the encoding for the XML in code via "header('... charset ...)", but also by having PHP itself convert the text. (And this is still on my local machine; what awaits on the hosting provider's server is unknown ...)
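The post does not say which conversion was used. One possible approach, shown here purely as an assumption, is to convert the Windows-1251 export to UTF-8 with iconv() before handing it to SimpleXML. The helper name `loadCp1251Xml` and the sample document are hypothetical.

```php
<?php
// Hypothetical helper: convert a Windows-1251 XML string to UTF-8, then parse it.
function loadCp1251Xml(string $raw): SimpleXMLElement
{
    $utf8 = iconv('Windows-1251', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException('Conversion from Windows-1251 failed');
    }
    // Keep the XML declaration in sync with the new encoding
    // (only the double-quoted form is handled in this sketch).
    $utf8 = preg_replace('/encoding="windows-1251"/i', 'encoding="UTF-8"', $utf8, 1);

    return new SimpleXMLElement($utf8);
}

// Simulate a Windows-1251 export by re-encoding a UTF-8 sample.
$sample = '<?xml version="1.0" encoding="windows-1251"?>'
        . '<каталог идентификатор="10"><продукт наименование="Товар 1"/></каталог>';
$raw = iconv('UTF-8', 'Windows-1251', $sample);

$каталог = loadCp1251Xml($raw);

// Here <каталог> is the root element, so $каталог is the root object.
print $каталог->продукт['наименование'];
```

After the conversion, the rest of the script works only with UTF-8 strings, so names written in a UTF-8 source file match the names in the document.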

This Cyrillic, damn it. Who keeps promoting it anyway?

  • Those who still do not know that Unicode is our everything) - nolka
  • Bitrix and 1C promote it: monopolists among the garbage heaps! There is so much junk in the code; how do they even make sense of it! - Artem