I have 400,000 XML files, most of them no larger than 2 KB. A Java application has to read them from disk, process them (currently with a StAX parser) and load the results into various collections.
How many threads should I use for this? Some people say it is inefficient to read from a disk with more than one thread; others, on the contrary, use quite a lot of threads.
Update:
@Arhad @KoVadim @Monk On a first run (i.e., with no warm-up) the whole process takes as long as an hour on my machine; on other machines it takes no more than 15 minutes. After several runs it takes about 15 minutes on my machine too. Yesterday I tried a ForkJoinPool (each folder handled by a separate thread) and a fixed thread pool (simply submitting every file as a Runnable task, one after another): no effect at all.
I used 4 threads (one per CPU core). Maybe there is a way to increase how much data is read from disk into memory in one go? The files are defragmented, so, if I understand correctly, the disk should be able to read them sequentially.
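For context, a fixed-thread-pool setup of the kind described above might look roughly like the sketch below. It is not the actual application code: the class name `PooledXmlLoader`, the methods `countElements` and `loadAll`, and the element counting (a stand-in for the real processing) are all made up for illustration, and it uses `Callable` rather than `Runnable` only so that a result can be returned.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PooledXmlLoader {

    // Hypothetical stand-in for the real processing: counts start elements in one file.
    static int countElements(Path file) throws IOException, XMLStreamException {
        // A new factory per call avoids relying on factory thread safety,
        // at the cost of some overhead per file.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try (InputStream in = Files.newInputStream(file)) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            try {
                int count = 0;
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        count++;
                    }
                }
                return count;
            } finally {
                reader.close();
            }
        }
    }

    // Submits every .xml file under root as its own task to a pool of the given size.
    static long loadAll(Path root, int threads) throws Exception {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> p.toString().endsWith(".xml"))
                        .collect(Collectors.toList());
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Path f : files) {
                Callable<Integer> task = () -> countElements(f);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Integer> r : results) {
                total += r.get(); // rethrows any parse or I/O failure wrapped in ExecutionException
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

With this shape every worker thread does its own disk read, so the threads compete for the disk as well as for the CPU; varying the `threads` argument (1, 2, 4, ...) is an easy way to see whether the disk or the parsing is the limit.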
About separating the logic: I thought about it at the very beginning, but I don't see how it could be applied with a StAX parser.
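One possible shape for such a separation, sketched under the assumption that each file is small enough to be read whole (most are under 2 KB): a single thread reads files into byte arrays, and worker threads run StAX over those in-memory bytes. The names `SplitReadParse`, `run` and `countElements`, the queue capacity of 1024, and the element counting are all illustrative assumptions, not part of the real application.

```java
import java.io.ByteArrayInputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class SplitReadParse {

    // End-of-input marker; compared by identity, so an empty file is not mistaken for it.
    private static final byte[] POISON = new byte[0];

    // Hypothetical stand-in for the real processing: counts start elements in one document.
    static int countElements(XMLInputFactory factory, byte[] xml) throws XMLStreamException {
        XMLStreamReader reader = factory.createXMLStreamReader(new ByteArrayInputStream(xml));
        try {
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            return count;
        } finally {
            reader.close();
        }
    }

    static long run(List<Path> files, int parsers) throws Exception {
        // Bounded queue: the reader blocks instead of filling the heap if parsing falls behind.
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(1024);
        ExecutorService pool = Executors.newFixedThreadPool(parsers);
        try {
            Callable<Long> worker = () -> {
                XMLInputFactory factory = XMLInputFactory.newInstance(); // one factory per worker
                long sum = 0;
                for (byte[] doc = queue.take(); doc != POISON; doc = queue.take()) {
                    try {
                        sum += countElements(factory, doc);
                    } catch (XMLStreamException e) {
                        // Skip a malformed document so the worker keeps draining the queue.
                        System.err.println("skipped malformed document: " + e.getMessage());
                    }
                }
                return sum;
            };
            List<Future<Long>> partials = new ArrayList<>();
            for (int i = 0; i < parsers; i++) {
                partials.add(pool.submit(worker));
            }
            try {
                // All disk access happens here, on the calling thread, one file after another.
                for (Path f : files) {
                    queue.put(Files.readAllBytes(f));
                }
            } finally {
                // One marker per worker so every worker terminates.
                for (int i = 0; i < parsers; i++) {
                    queue.put(POISON);
                }
            }
            long total = 0;
            for (Future<Long> p : partials) {
                total += p.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the split is that the disk only ever sees one reader, while the CPU-bound parsing can still use all cores. Whether that helps here depends on where the time actually goes, which this sketch does not answer.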