How do I calculate all exponentiations from x_min ^ y_min to x_max ^ y_max without repeating results? For example:

  • x2 ^ y8 = z
  • ...
  • x85 ^ y23 = p
  • ...
  • but also x85 ^ y28 = z!

As you can see, the result z is repeated for different calculations. How can we skip all such subsequent calculations, given that comparing against previously calculated results is not allowed?

 # Python 3
 min_x, max_x = 5, 127
 min_y, max_y = 2, 250
 result = ''
 for x in range(min_x, max_x):
     for y in range(min_y, max_y):
         result += str(x ** y)
  • Why is it unacceptable? Where do these requirements come from? And what exactly is forbidden? - pavel
  • Because it takes a very long time. The values of x and y can be much greater than 1,000, and all calculated results are stored in the result variable. - sergus
  • So the only question is performance? - MaxU
  • Yes and no. I also do not want to waste extra memory, because at least 2 GB is planned just for storing the result variable. Is there a mathematical formula for solving this problem? Just imagine how long it takes to search for a substring in at least 300 MB of text.. :) - sergus
  • Well, 300 MB is not that much :) Calculate everything, sort, and delete the duplicates. - pavel
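A way to skip duplicates without ever comparing the huge results themselves (a sketch, not from the thread; the helper names are mine): x1^y1 == x2^y2 exactly when both reduce to the same primitive base, i.e. writing x = b^k with b not itself a perfect power gives x^y = b^(k*y). Deduplicating the small (b, k*y) pairs means the big numbers are never stored or compared.

```python
def primitive_root(x):
    """Return (b, k) with x == b**k and b not itself a perfect power."""
    # Try every candidate exponent from largest down; check neighbours of the
    # float root to guard against rounding error.
    for k in range(x.bit_length(), 1, -1):
        b = round(x ** (1.0 / k))
        for cand in (b - 1, b, b + 1):
            if cand > 1 and cand ** k == x:
                return cand, k
    return x, 1

def distinct_powers(min_x, max_x, min_y, max_y):
    """Yield each distinct x**y exactly once (bounds inclusive)."""
    seen = set()  # small (base, exponent) pairs, never the big results
    for x in range(min_x, max_x + 1):
        b, k = primitive_root(x)
        for y in range(min_y, max_y + 1):
            key = (b, k * y)
            if key not in seen:
                seen.add(key)
                yield x ** y
```

For example, in the range x, y ∈ [2, 5], the pair 2^4 = 4^2 = 16 is produced only once, because both map to the key (2, 4).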

2 answers

UPDATE: just checked with your numbers - Python "occupied" about 1 GB of memory; I have updated the timings for your numbers:

 In [105]: x = np.arange(5, 8800, dtype=np.uint64)

 In [106]: y = np.arange(2, 8250, dtype=np.uint64)

 In [107]: r = np.unique(np.array([np.power(x, np.repeat([pw], len(x))) for pw in np.nditer(y)]))

 In [108]: len(''.join(r.astype(str)))
 Out[108]: 457656

 In [109]: %timeit np.unique(np.array([np.power(x, np.repeat([pw], len(x))) for pw in np.nditer(y)]))
 1 loop, best of 3: 11.3 s per loop

 In [115]: np.version.version
 Out[115]: '1.10.4'

numpy solution:

 import numpy as np

 x = np.arange(5, 127, dtype=np.uint64)
 y = np.arange(2, 250, dtype=np.uint64)
 r = np.unique(np.array([np.power(x, np.repeat([pw], len(x))) for pw in np.nditer(y)]))
 # to save memory, store the joined string back into `r`
 r = ''.join(r.astype(str))
 print(r)

Timing for the 5000 x 5000 array:

 In [102]: %timeit np.unique(np.array([np.power(x, np.repeat([pw], len(x))) for pw in np.nditer(y)]))
 1 loop, best of 3: 3.89 s per loop

 In [103]: x.shape
 Out[103]: (4995,)

 In [104]: y.shape
 Out[104]: (4998,)
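The list comprehension with np.repeat can also be replaced by a single broadcast, which avoids building the intermediate Python list of row arrays (a sketch with small ranges; the same uint64 overflow caveat discussed below applies to large inputs):

```python
import numpy as np

# small ranges chosen so every x**y still fits in uint64
x = np.arange(5, 17, dtype=np.uint64)
y = np.arange(2, 6, dtype=np.uint64)

# x[:, None] has shape (n, 1) and y[None, :] has shape (1, m);
# broadcasting produces the full (n, m) grid of x**y in one vectorized call
grid = np.power(x[:, None], y[None, :])
r = np.unique(grid)  # sorted distinct values
```

In this small range the only collision is 8^4 == 16^3 == 4096, so `r` holds 47 values out of the 48 in the grid.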
  • Small numbers work really fast! But with this data it crashes with a MemoryError: x = np.arange(5, 8800, dtype=np.uint64); y = np.arange(2, 8250, dtype=np.uint64). How can this be avoided? There is plenty of memory; maybe this is a numpy bug? - sergus
  • I changed the solution a bit - np.repeat() instead of np.array() - this should save memory. What is your numpy version? - MaxU
  • Checked the new code: the Python process reaches 750 MB and fails with the same error. PyCharm latest update, Python 3.4.4 x32, numpy 1.11.0 - sergus
  • "Plenty of memory" and x32 are not exactly compatible concepts ;) Try closing extra applications, including the IDE, and run the script from the command line. - MaxU
  • I meant that it should get its honest 3 GB :) I ran the script from CMD on Python 2.7 without closing anything, and it worked! On 3.x I rolled back to 1.10.4 but still get the error: File "C:\Users\..\AppData\Roaming\Python\Python34\site-packages\numpy\lib\arraysetops.py", line 176, in unique: ar = np.asanyarray(ar).flatten() MemoryError - sergus

Since the author's main problem is a lack of resources (or is there some abstract problem?), we can offer a solution: do not generate all the values at once, but write a generator that produces values as they are needed. After all, these values will be used somewhere anyway - whether written to a file or processed as part of a resulting vector - and they will be consumed one by one.

 def give_me_square():
     first = range(5, 8800)
     second = range(2, 9000)
     for i in first:
         for j in second:
             yield i ** j

 counter = 0
 total = 8800 * 9000
 for square in give_me_square():
     if counter % 10000 == 0:
         print(counter, "of", total)
     counter += 1

It works quite reliably and does not eat memory. Such a generator can be made asynchronous, or handed to multiprocessing.Pool.map - there is a lot you can build on top of it.

  • By the way, this is a reliable option; you only need to add filtering of duplicates. - MaxU
  • Yes, I rejected this option precisely because searching for and deleting duplicates takes too long :) I need all the values at once; they will not be written to a file, they are stored only in virtual memory. - sergus
    @sergus, what do you mean, "all"? How are you going to use the list of all non-repeating values afterwards? If you store all these numbers - no matter how exactly - then you can say goodbye to the dream that everything fits in memory. First, the memory complexity is O(n*m). Second, 1000 to the power of 1000 is a huge number, far beyond int64 (2**64). Computing such a number already requires long (arbitrary-precision) arithmetic, and that is a completely different conversation in terms of speed and memory: working with such numbers is not "cheap". - m9_psy
  • Yes, "I did not notice the elephant"... numpy will not be able to calculate 1000^1000 correctly. Here is an example: x = np.array([1000], dtype=np.uint64); np.power(x, x) Output: array([9223372036854775808], dtype=uint64) - MaxU
  • Thanks for the answer! I need to search for numbers in this array. As I wrote above in the comments, I am thinking of splitting it into portions and checking substrings. Oh, what a nuisance. So the script cannot correctly compute 1000^1000 right now? What do you advise in this case? I am not familiar with numpy, don't be too hard on me :) - sergus
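On the 1000^1000 point raised in these comments: NumPy's fixed-width integer dtypes silently wrap around, but Python's built-in int is arbitrary-precision, so plain Python (or NumPy with dtype=object, which stores Python ints and trades speed for exactness) computes it exactly. A small sketch:

```python
import numpy as np

# plain Python ints never overflow
n = 1000 ** 1000          # == 10**3000, a 3001-digit number
print(len(str(n)))        # 3001

# dtype=object makes numpy hold Python ints instead of uint64,
# so np.power defers to Python's exact big-integer arithmetic
x = np.array([1000], dtype=object)
print(np.power(x, x)[0] == 1000 ** 1000)   # True
```

The price is that object arrays lose numpy's vectorized speed and compact memory layout, which is exactly the trade-off m9_psy describes above.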