I ran into a problem in Python. The task is to parse JSON from a server, and the object is 4-10 GB in size. I don't know why anyone designed it this way, and I don't approve of the approach. :) You cannot request part of the JSON, only the whole thing. I use requests and send a POST request to receive the JSON. It takes a while to load. When I try to parse it with the json() method, or even just read text, I get a MemoryError, because memory usage jumps by roughly the size of the JSON (on top of the response object itself, which also holds a copy of about the same size). So parsing it this way is impossible.
1 answer
You can use the ijson module for this. It processes JSON as a stream rather than as a single block, so running out of memory should not be a problem.
P.S. You can also try json-streamer. Judging by its description, it lets you process a partial JSON file, for example when only part of the whole file has been downloaded and it already needs to be processed.
- I doubt that the Python object corresponding to 10 GB of JSON will take up less space. So the author is better off reacting to events on the fly, without parsing the whole JSON into one object, and keeping only the information that is needed; both of the proposed modules can probably do that (they use the same library behind the scenes). - jfs