In a DTD, you can declare that an element may contain both #PCDATA and other elements. This type of content is called mixed. To specify a mixed content model, it suffices to list #PCDATA along with the allowed child elements.

    <?xml version = "1.0" standalone="yes"?>
    <!DOCTYPE DOCUMENT [
    <!ELEMENT DOCUMENT (CUSTOMER)*>
    <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
    <!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
    <!ELEMENT LAST_NAME (#PCDATA)>
    <!ELEMENT FIRST_NAME (#PCDATA)>
    <!ELEMENT DATE (#PCDATA)>
    <!ELEMENT ORDERS (ITEM)*>
    <!ELEMENT ITEM (PRODUCT, NUMBER, PRICE)>
    <!--mixed-->
    <!ELEMENT PRODUCT (#PCDATA | PRODUCT_ID )*>
    <!ELEMENT NUMBER (#PCDATA)>
    <!ELEMENT PRICE (#PCDATA)>
    <!ELEMENT PRODUCT_ID (#PCDATA)>
    ]>
    <DOCUMENT>
      <CUSTOMER>
        <NAME>
          <LAST_NAME>Smith</LAST_NAME>
          <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2003</DATE>
        <ORDERS>
          <ITEM>
            <PRODUCT>Tomatoes</PRODUCT>
            <NUMBER>8</NUMBER>
            <PRICE>$1.25</PRICE>
          </ITEM>
          <ITEM>
            <PRODUCT> <PRODUCT_ID> 124829548702121 </PRODUCT_ID> </PRODUCT>
            <NUMBER>24</NUMBER>
            <PRICE>$4.98</PRICE>
          </ITEM>
        </ORDERS>
      </CUSTOMER>
    </DOCUMENT>

When I checked the file with validators (.NET XML Parser, MSXML SAX, MSXML DOM, Java built-in), I noticed that validation passes if #PCDATA is first in the list. If any element precedes #PCDATA, validation errors appear (each parser words them differently, but the essence is the same).

    <?xml version = "1.0" standalone="yes"?>
    <!DOCTYPE DOCUMENT [
    <!ELEMENT DOCUMENT (CUSTOMER)*>
    <!ELEMENT CUSTOMER (NAME,DATE,ORDERS)>
    <!ELEMENT NAME (LAST_NAME,FIRST_NAME)>
    <!ELEMENT LAST_NAME (#PCDATA)>
    <!ELEMENT FIRST_NAME (#PCDATA)>
    <!ELEMENT DATE (#PCDATA)>
    <!ELEMENT ORDERS (ITEM)*>
    <!ELEMENT ITEM (PRODUCT, NUMBER, PRICE)>
    <!-- mixed -->
    <!-- error. Why? -->
    <!ELEMENT PRODUCT (NUMBER | #PCDATA | PRODUCT_ID )*>
    <!ELEMENT NUMBER (#PCDATA)>
    <!ELEMENT PRICE (#PCDATA)>
    <!ELEMENT PRODUCT_ID (#PCDATA)>
    ]>
    <DOCUMENT>
      <CUSTOMER>
        <NAME>
          <LAST_NAME>Smith</LAST_NAME>
          <FIRST_NAME>Sam</FIRST_NAME>
        </NAME>
        <DATE>October 15, 2003</DATE>
        <ORDERS>
          <ITEM>
            <PRODUCT>Tomatoes</PRODUCT>
            <NUMBER>8</NUMBER>
            <PRICE>$1.25</PRICE>
          </ITEM>
          <ITEM>
            <PRODUCT> <PRODUCT_ID> 124829548702121 </PRODUCT_ID> </PRODUCT>
            <NUMBER>24</NUMBER>
            <PRICE>$4.98</PRICE>
          </ITEM>
        </ORDERS>
      </CUSTOMER>
    </DOCUMENT>

Why must #PCDATA come first in a mixed content declaration?
#PCDATA should always come first.

That is exactly why the question arose. The specification gives only the following:

    Mixed ::= '(' S? '#PCDATA' (S? '|' S? Name)* S? ')*' | '(' S? '#PCDATA' S? ')'

That is, does the specification say that the element S can come first (or not)? There are, of course, examples:

    <!ELEMENT p (#PCDATA|a|ul|b|i|em)*>
    <!ELEMENT b (#PCDATA)>

But they do not cover all possible options. - java1cprog

In the specification, S? denotes optional whitespace characters.

Now everything is clear! - java1cprog
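
To tie the thread together, here is a minimal sketch of the PRODUCT declaration from the question with #PCDATA moved to the front, which is the only position the Mixed production allows. The one-element document around it is made up for illustration so that the snippet is self-contained; it is not taken from the original files.

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE PRODUCT [
<!-- #PCDATA comes first; the element names that follow may be listed in any order -->
<!ELEMENT PRODUCT (#PCDATA | NUMBER | PRODUCT_ID)*>
<!ELEMENT NUMBER (#PCDATA)>
<!ELEMENT PRODUCT_ID (#PCDATA)>
]>
<!-- Hypothetical instance: text and child elements mixed freely inside PRODUCT -->
<PRODUCT>Tomatoes <NUMBER>8</NUMBER> <PRODUCT_ID>124829548702121</PRODUCT_ID></PRODUCT>
```

Because S? is optional whitespace, `(#PCDATA|NUMBER|PRODUCT_ID)*` and `( #PCDATA | NUMBER | PRODUCT_ID )*` match the same production. Per the grammar quoted above, the closing `)*` is required as soon as any element name is listed after #PCDATA.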