Hello!
Please help me canonicalize XML (Exclusive XML Canonicalization, http://www.w3.org/2001/10/xml-exc-c14n#) of the following kind:

<?xml version="1.0" encoding="UTF-8"?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd" xmlns:ws="http://ws.unisoft/" xmlns:rev="http://smev.gosuslugi.ru/rev120315" xmlns:rq1="http://ws.unisoft/CPSubPercent/Rq1" xmlns:inc="http://www.w3.org/2004/08/xop/include"> <soapenv:Header> <wsse:Security soapenv:actor="http://smev.gosuslugi.ru/actors/smev"> <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#"> <ds:KeyInfo> <wsse:SecurityTokenReference> <wsse:Reference URI="#SenderCertificate" /> </wsse:SecurityTokenReference> </ds:KeyInfo> </ds:Signature> <wsse:BinarySecurityToken EncodingType="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-soap-message-security-1.0#Base64Binary" ValueType="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-x509-token-profile-1.0#X509v3" wsu:Id="SenderCertificate" /> </wsse:Security> </soapenv:Header> <soapenv:Body wsu:Id="body"> <ws:async_getId_SendRequest_Tyrim_Pyrim> <rev:Message> <rev:Sender> <rev:Code>CODE63544</rev:Code> <rev:Name>ТутИмя</rev:Name> </rev:Sender> <rev:Recipient> <rev:Code>CODE2</rev:Code> <rev:Name>Тест</rev:Name> </rev:Recipient> <rev:Originator> <rev:Code>CODE3</rev:Code> <rev:Name>Тест</rev:Name> </rev:Originator> <rev:ServiceName>ТестВебСервис</rev:ServiceName> <rev:TypeCode>GSRV</rev:TypeCode> <rev:Status>REQUEST</rev:Status> <rev:Date>2012-03-13T12:12:12Z</rev:Date> <rev:ExchangeType>1</rev:ExchangeType> <rev:RequestIdRef /> <rev:OriginRequestIdRef /> <rev:ServiceCode /> <rev:CaseNumber /> <rev:TestMsg /> </rev:Message> <rev:MessageData> <rev:AppData> <rq1:Документ ВерсияФормата="1.0" UIDЗапроса="4832bdef-bef7-459a-8397-dc28793f59d4"> <РегНомер>1</РегНомер> </rq1:Документ> </rev:AppData> <rev:AppDocument> <rev:RequestCode>req_4832bdef-bef7-459a-8397-dc28793f59d4</rev:RequestCode> 
<rev:BinaryData>UEsDBBQAAAAIABm=</rev:BinaryData> </rev:AppDocument> </rev:MessageData> </ws:async_getId_SendRequest_Tyrim_Pyrim> </soapenv:Body> </soapenv:Envelope> 

That is, it is a SOAP request in XML.
The problem is that I don't know how to bring XML into canonical form, so I can't do it by hand, and I need to implement it in code (if not canonicalization of arbitrary XML, then at least generating XML that is already in canonical form).

Of course, I have read the recommendation (http://www.w3.org/TR/xml-c14n), but I did not really understand what I need to do with the XML...
About the only things that are clear are that there must be no extra whitespace inside tags and that empty tags like "<tag />" must be expanded into "<tag></tag>".
But this is clearly not enough.
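
As a rough illustration of what a canonicalizer changes beyond those two points, here is a minimal Python sketch. It assumes Python 3.8+ and uses the standard library's xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather than the exc-c14n 1.0 algorithm named above, so treat the output as showing the kind of normalization involved, not as bytes suitable for signing. The sample documents and the urn:example:* namespaces are made up for the example.

```python
from xml.etree.ElementTree import canonicalize

# Syntax-level differences that canonicalization removes:
# the XML declaration, spacing inside tags, quote style,
# attribute order, and the short form of empty elements.
raw = ('<?xml version="1.0" encoding="UTF-8"?>'
       '<doc  b="2"   a=\'1\' > <empty /> </doc>')
canonical = canonicalize(raw)
print(canonical)
# <doc a="1" b="2"> <empty></empty> </doc>

# Namespace declarations (hypothetical urn:example:* URIs):
# declarations that no element or attribute uses are dropped,
# and the rest are written on the element that first needs them.
ns_raw = ('<e:Envelope xmlns:e="urn:example:env" '
          'xmlns:u="urn:example:unused" xmlns:a="urn:example:app">'
          '<e:Body><a:Request /></e:Body></e:Envelope>')
ns_canonical = canonicalize(ns_raw)
print(ns_canonical)
```

Note that the whitespace between elements in the first document is kept: whitespace in element content is part of the data as far as canonicalization is concerned. In the second document the unused xmlns:u declaration disappears and xmlns:a ends up on a:Request, which resembles the namespace handling of exclusive canonicalization.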

I am asking for help from those who have already dealt with this...

PS An answer to the question "why is this necessary" would be ideal, but a simple explanation of what the resulting XML should look like will do...

  • I have doubts about the non-English tag and attribute names that are written out in full in your XML: <rq1:Документ ВерсияФормата="1.0" UIDЗапроса="4832bdef-bef7-459a-8397-dc28793f59d4"> <РегНомер>1</РегНомер> </rq1:Документ> - Barmaley
  • These are tags for 1C, so that is how it has to be... - t1nk
  • klopp, in that case I do not understand why not use C# from the start, since some external utilities are used anyway. Or does it have to run under Unix? In general, it needs to work in the browser, but first I need to reproduce what C# produces, either by hand or with some automation. Hence the question. - t1nk
  • And the source code of the C# class cannot be found? - user6550
  • How do you imagine searching in native code? The base classes are shipped as DLLs. - t1nk

1 answer

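    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::LibXML;

    # Parse the input file into a DOM document.
    my $parser = XML::LibXML->new();
    my $doc = $parser->parse_file('1.xml');

    # Print the document in canonical (C14N) form.
    print $doc->toStringC14N();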
  • It gave an error:
    Microsoft Windows [Version 6.1.7601] (c) Microsoft Corporation (Microsoft Corp.), 2009. All rights reserved.
    c:\Perl>canonicalization.pl
    Can't locate XML/LibXML.pm in @INC (@INC contains: C:/Perl/site/lib C:/Perl/lib .) at C:\Perl\canonicalization.pl line 4.
    BEGIN failed--compilation aborted at C:\Perl\canonicalization.pl line 4.
    c:\Perl>
    - t1nk
  • Captain Obvious hints that the module needs to be installed. For example, with ActiveState Perl: ppm install XML::LibXML - user6550
  • Thanks, I realized that something was missing but did not know how to fix it... I have not worked with Perl since university. By the way, is this exactly the canonicalization defined by c14n (w3.org/2001/10/xml-exc-c14n#)? - t1nk
  • There is also a toStringEC14N() method; this is what the documentation says about it: "XML-EXC-C14N Specification (see w3.org/TR/xml-exc-c14n) for exclusive canonization of XML." You can dig deeper into the arguments here: search.cpan.org/~shlomif/XML-LibXML-2.0014/lib/XML/LibXML/… - user6550
  • This is closer to what I intuitively expected, but the result still does not match what the hash calculation algorithm requires... The hash does not match... I will keep digging into this topic, thank you. - t1nk
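
One possible reason for a hash (digest) mismatch like the one in the last comment is that canonicalization keeps whitespace between elements, so an indented copy of a message hashes differently from a compact one even though the markup is otherwise equivalent. The Python sketch below illustrates this under several assumptions: it uses the standard library's xml.etree.ElementTree.canonicalize (C14N 2.0, Python 3.8+) instead of exc-c14n, SHA-256 as a stand-in for whatever digest algorithm the service requires, and made-up sample documents. The digest helper is a hypothetical name introduced only for this example.

```python
import base64
import hashlib
from xml.etree.ElementTree import canonicalize


def digest(xml_text):
    """Base64 digest of the canonical form (SHA-256 is a stand-in)."""
    canonical_bytes = canonicalize(xml_text).encode("utf-8")
    return base64.b64encode(hashlib.sha256(canonical_bytes).digest()).decode("ascii")


compact = '<body id="b" n="1"><item/></body>'
# Same content: different attribute order, quotes, and empty-tag form.
reordered = "<body n='1' id='b'><item></item></body>"
# Same elements, but with line breaks and indentation between them.
indented = '<body id="b" n="1">\n  <item/>\n</body>'

print(digest(compact) == digest(reordered))  # True
print(digest(compact) == digest(indented))   # False
```

So when comparing against a reference implementation, the input must match it in whitespace between elements as well as in the canonicalization variant (inclusive or exclusive) and in the element that is actually being hashed.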