The following code simulates loading and parsing HTML: it pulls the links out of each generated page and stores them in a global array.
If the links are saved as shown on the line marked "Option 1", Node starts eating memory: roughly 1 GB for 30,000 links. If you comment that line out and save them with the "Option 2" method instead, everything is fine: memory still grows, but slowly, in proportion to the links array, and you can collect a million of them.
I cannot understand the difference, since in both cases a primitive string is stored in the array. I have checked every way I can think of: calling $().text() does not return some object that drags along a reference to the whole DOM and the source text; it returns a plain string. Yet the impression is that something is being retained, and the garbage collector cannot free the memory.
var cheerio = require('cheerio');
var _ = require('lodash');
var links = [];

for (var i = 0; i < 100000; i++) {
    getS();
    console.log('%s links collected', links.length);
}

function getS() {
    // Build a fake page: 20 links padded with filler spans.
    var body = '<body>';
    for (var i = 0; i < 20; i++) {
        body += '<a class="i-ljuser-username" href="#">' + Math.random() * 100000 + '</a>';
        while (body.length < 15000 * i) {
            body += '<span>' + Math.random() * 100000 + '</span>';
        }
    }
    body += '</body>';

    var $ = cheerio.load(body);
    var list = $('a.i-ljuser-username');
    _.forEach(list, function (item) {
        links.push($(item).text());                  // Option 1
        //links.push(($(item).text() + ' ').trim()); // Option 2
    });
}
With $(item).text() + ' ' you explicitly build a new string that "forgets where it came from." In the first case the virtual machine tries to save on copying and stores not a plain string but an object that looks like a string yet remembers "too much". – KoVadim
.text() can in fact return something other than a regular flat string. – etki
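The comments above describe an engine-level behavior: in V8, a substring taken from a large string may be represented internally as a "sliced" string that keeps the entire parent buffer alive, so storing many such substrings retains all the source HTML. A minimal sketch of the workaround (the flatten helper name is my own; the retention itself is a V8 implementation detail, not something observable through the language API):

```javascript
// Sketch of the workaround discussed above. Concatenating and trimming
// produces a freshly allocated string that no longer references the
// original buffer — the same trick as "Option 2" in the question.
// Note: this relies on engine internals, not on anything the spec guarantees.

function flatten(str) {
    return (str + ' ').trim();
}

var big = new Array(1000000).join('x') + '<needle-1234567890>';
var slice = big.slice(big.length - 19); // may internally retain all of `big`
var copy = flatten(slice);              // independent flat copy

console.log(copy === slice); // true: the values are equal,
// but only `copy` is safe to keep in a long-lived array once `big`
// should be garbage-collected.
```

The same reasoning explains why Option 2 in the question fixes the leak: cheerio's .text() ultimately returns substrings of the loaded HTML, and forcing a copy lets each parsed document be collected.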