The body contains some text, for example:

<body> Какой то текст </body> 

When I read it in JavaScript and split it into separate words, for example:

 <script>
   var bodyHTML = document.getElementsByTagName('body')[0].innerHTML;
   bodyHTML = bodyHTML.split(' ');
   console.log(bodyHTML);
 </script>

I see this array in the console:

 ["↵Какой", "то", "текст↵↵"] 

That is, the first word is preceded by a line-break symbol (↵), and two more follow the last word. How can I remove them with JavaScript without changing the body's HTML?

    2 answers

    I already found a solution:

     bodyHTML = bodyHTML.replace(/[\n\r]+/g, ''); 
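     One caveat with deleting line breaks outright: two words separated only by a line break get glued together. A minimal sketch of an alternative, assuming it is acceptable to treat any run of whitespace as a separator (the helper name `splitIntoWords` is hypothetical and not part of the original answer):

```javascript
// Hypothetical helper: split a string into words, treating any run of
// whitespace (spaces, tabs, \n, \r) as a single separator.
function splitIntoWords(text) {
  // trim() drops leading/trailing whitespace, including line breaks.
  const trimmed = text.trim();
  // ''.split(/\s+/) would give [''], so handle the empty case explicitly.
  return trimmed === '' ? [] : trimmed.split(/\s+/);
}

// Same shape as the string from the question.
const sample = '\nКакой то текст\n\n';
console.log(splitIntoWords(sample)); // ["Какой", "то", "текст"]

// Deleting line breaks merges words that were separated only by "\n":
console.log('a\nb'.replace(/[\n\r]+/g, '')); // "ab"
console.log(splitIntoWords('a\nb'));         // ["a", "b"]
```

     In the browser this could be called as `splitIntoWords(document.body.innerHTML)`; the markup itself is left unchanged.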

      Another option is to use the innerText or textContent property.

       // Option 1: innerText
       var bodyHTML = document.getElementsByTagName('body')[0].innerText;
       // Option 2: textContent
       var bodyHTML = document.getElementsByTagName('body')[0].textContent;

       var b = document.body;
       // Keep only non-empty tokens shorter than 5 characters.
       function isShortWord(el) { return el.length < 5 && el.length > 0; }
       var ih = b.innerHTML.split(' ').filter(isShortWord);
       var it = b.innerText.split(' ').filter(isShortWord);
       var tc = b.textContent.split(' ').filter(isShortWord);
       // Print what each property yields so the three can be compared.
       document.write(
         '<br/>innerHTML: ' + JSON.stringify(ih),
         '<br/>innerText: ' + JSON.stringify(it),
         '<br/>textContent: ' + JSON.stringify(tc)
       );
       asds gasd sgdf gsdf g 
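       textContent returns the raw text of the node, so line breaks from the markup are still present and the result needs the same whitespace handling. A minimal sketch, assuming only that the argument has a `textContent` string (the helper `wordsFromTextContent` and the stand-in object `fakeBody` are hypothetical, used so the example also runs outside a browser):

```javascript
// Hypothetical helper: collect words from anything exposing textContent,
// e.g. document.body in a browser.
function wordsFromTextContent(node) {
  // textContent can be null for some node types, so fall back to ''.
  const text = node.textContent || '';
  // match() returns null when nothing matches, so fall back to [].
  return text.match(/\S+/g) || [];
}

// Plain object standing in for an element.
const fakeBody = { textContent: '\nКакой то текст\n\n' };
console.log(wordsFromTextContent(fakeBody)); // ["Какой", "то", "текст"]
```

       Note that on a real page, textContent also includes the source of any script or style elements inside the node, which is one reason its output can differ from innerText.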

      • innerText works, but textContent does not help. - stckvrw
      • @stckvrw, it seems so. I added an example showing what each property returns. Notably, the output is different for all three. - Grundy