I need to load a lot of small images. Each letter of the alphabet has about 2000 images, each 145x145 pixels (each weighing about 400 bytes). Each image is loaded as follows:

    from PIL import Image
    import numpy as np

    im = Image.open(filepath).convert('L')
    (width, height) = im.size
    greyscale_map = list(im.getdata())
    greyscale_map = np.array(greyscale_map)
    greyscale_map = greyscale_map.reshape((height, width))
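For reference: np.array(list(im.getdata())) promotes the 1-byte greyscale pixels to the platform's default integer type (usually int64), so every pixel costs 8 bytes instead of 1, on top of losing the source files' compression. A minimal sketch that keeps PIL's native uint8 dtype (assuming filepath points at one of the 145x145 greyscale files):

    from PIL import Image
    import numpy as np

    im = Image.open(filepath).convert('L')
    # np.asarray reads the 8-bit greyscale buffer directly:
    # shape (height, width), dtype uint8 -- 1 byte per pixel
    greyscale_map = np.asarray(im)

This has the same effect as an explicit astype(np.uint8), just without the intermediate Python list.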

As a result, 14 MB of images turn into 14 GB of .npy objects. Can I do something about the size? Maybe load the data differently? Later, this data will need to be fed through a neural network.

  • Can you post links to a few of the images so the code can be tested? Which libraries do you plan to use to train the models? Do they support sparse matrices (csr_matrix)? - MaxU
  • Instead of saving the images as uncompressed numpy arrays in .npy files on disk, you can keep them in compressed image formats: png, gif, djvu, jpeg, etc. (see the sketch after these comments) - jfs
  • As input I use the uint8 type; this reduced the size by almost a factor of 8. I plan to use lasagne - Tolkachev Ivan
  • The images are like MNIST, only instead of just digits there are all the letters of the Russian alphabet - Tolkachev Ivan
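Building on the comments above, a hedged sketch of one way to combine both suggestions (the directory layout, file names, and the pack_letter helper are assumptions, not part of the question): stack all images of one letter into a single uint8 array and store it with np.savez_compressed, so the on-disk size stays close to that of the compressed source images instead of one uncompressed .npy per file.

    import os
    from PIL import Image
    import numpy as np

    def pack_letter(letter_dir, out_path):
        """Stack every image in letter_dir into one (N, 145, 145) uint8
        array and write it as a single compressed .npz archive."""
        arrays = []
        for name in sorted(os.listdir(letter_dir)):
            im = Image.open(os.path.join(letter_dir, name)).convert('L')
            arrays.append(np.asarray(im))            # uint8, (145, 145)
        batch = np.stack(arrays)                     # (N, 145, 145), uint8
        np.savez_compressed(out_path, images=batch)  # zlib-compressed .npz

Reading it back with np.load(out_path)['images'] yields the whole batch as one uint8 array, which can be cast to float32 at training time, the dtype Lasagne/Theano typically expect.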
