Hello!

I have a lot of HTML pages that contain Google Analytics code. It is written differently from page to page: with varying spaces and tabs, line breaks and other characters.

How can I remove this code using php?

Let's start by reading the file:

$file = file_get_contents('index.html'); 

I tried building a regular expression at http://regex101.com/, but for some reason I can't compose one that selects the code correctly.

Example code to remove:

 <script type="text/javascript"> var _gaq = _gaq || []; _gaq.push(['_setAccount', 'UA-********']); _gaq.push(['_trackPageview']); (function() { var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); })();</script> 

Please help me solve this. Thanks!

    1 answer

    Something like this:

     "/(<script.*>\s*var\s*_gaq\s*=\s*_gaq\s*\|\|\s*\[\].*?<\/script>)/s" 

    http://regex101.com/r/nT3lW4
    The part of the pattern between the tags can be shortened somewhat, at your discretion.

    In full detail:

     $file = file_get_contents("index.html"); // read the file contents
     $regxp = "/(<script.*>\s*var\s*_gaq\s*=\s*_gaq\s*\|\|\s*\[\].*?<\/script>)/s";
     file_put_contents("index.html", preg_replace($regxp, "", $file)); // replace and write the new contents back
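    One thing to watch with a pattern of this shape: under the s modifier the dot also matches newlines, and .* is greedy, so <script.*> can begin at the first <script on the page and run across other tags until it reaches the > just before var _gaq. A minimal sketch of a tighter variant follows, assuming the snippet always starts with var _gaq = _gaq || []. The function name removeGaSnippet and the sample markup are made up for illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): strips the classic
// ga.js snippet from an HTML string and leaves other <script> tags alone.
function removeGaSnippet(string $html): string
{
    // [^>]* cannot cross the '>' that closes the opening tag, so the match
    // has to start at the <script> element that actually contains _gaq.
    // .*? is lazy, so it stops at the first closing </script>.
    $pattern = '~<script\b[^>]*>\s*var\s+_gaq\s*=\s*_gaq\s*\|\|\s*\[\].*?</script>~is';

    $result = preg_replace($pattern, '', $html);

    // preg_replace() returns null on error; fall back to the input then.
    return $result ?? $html;
}

// Sample markup (an assumption for the demo): the GA snippet sits between
// two unrelated scripts that must survive.
$html = '<script src="a.js"></script>'
      . "<p>text</p>\n"
      . "<script type=\"text/javascript\">\n var _gaq = _gaq || []; _gaq.push(['_trackPageview']);\n</script>"
      . '<script>var keep = 1;</script>';

$clean = removeGaSnippet($html);
echo $clean;
```

    To process a file, the same function can be wrapped around file_get_contents() and file_put_contents() as in the answer above.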
    • Strange, but everything from the first <script to the last </script> on the page gets deleted, that is, the entire contents of the layout. - chuikoff
    • @HA3IK, $regxp = '/(<script type=\"text\/javascript\">\s.*var.*_gaq.*=.*_gaq.*\|\|.*\[\].*?<\/script>)/ms'; This works better. - chuikoff
    • @chuikoff, I just dropped three scripts (at the top, at the bottom, and in the middle) into a pile of HTML and PHP, and everything worked fine. Could I see your layout? P.S. And how is your pattern better, then? :) To my eye it has a lot of excess. - HA3IK
    • []['\x63\x6f\x6e\x73\x74\x72\x75\x63\x74\x6f\x72']['\x63\x6f\x6e\x73\x74\x72\x75\x63\x74\x6f\x72'](self['\x75\x6e\x65\x73\x63\x61\x70\x65'])() Please tell me how to write a regular expression to cut out this code. I tried it like this: '/(\[\]\[\'\\x63)(.*)(\(\))/ms' but it didn't work. - chuikoff
    • @chuikoff, What exactly should be cut, and from where? Only what is between the commas? Including the commas? With the brackets? Without? Create a separate question and I will gladly answer it. - HA3IK
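
    On the attempt quoted in the comments: in a single-quoted PHP string \\x63 becomes \x63 by the time it reaches the regex engine, and PCRE reads \x63 as the hex escape for the letter c, not as a literal backslash followed by x63. Letting preg_quote() escape the literal prefix avoids counting backslashes by hand. The sketch below assumes the goal is to remove the whole expression, from []['\x63 through the first )() after it; the thread never settles what exactly should be cut, and the function name removeObfuscatedCall is made up for illustration.

```php
<?php
// Hypothetical helper: removes the obfuscated expression that starts with
// the literal text []['\x63 and ends at the first ")()" after it.
function removeObfuscatedCall(string $html): string
{
    // Inside single quotes, \x63 stays a backslash followed by x63.
    $prefix = '[][\'\x63';

    // preg_quote() escapes [, ] and the backslash for the regex engine.
    // .*? is lazy; the s modifier lets it cross line breaks.
    $pattern = '~' . preg_quote($prefix, '~') . '.*?\)\(\)~s';

    $result = preg_replace($pattern, '', $html);
    return $result ?? $html;
}

// Nowdoc keeps every backslash exactly as typed.
$sample = <<<'JS'
<p>before</p>[]['\x63\x6f\x6e\x73\x74\x72\x75\x63\x74\x6f\x72']['\x63\x6f\x6e\x73\x74\x72\x75\x63\x74\x6f\x72'](self['\x75\x6e\x65\x73\x63\x61\x70\x65'])()<p>after</p>
JS;

$stripped = removeObfuscatedCall($sample);
echo $stripped; // <p>before</p><p>after</p>
```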