How do I feed an RNN input features represented by a list of vectors (i.e., a whole sentence) rather than a single vector (i.e., one word)?

Currently, for each word in a sentence I compute its vector representation (word2vec), multiply it by the word's TF-IDF weight, and average over the sentence, which gives one vector of fixed dimension. I want to feed the network not this average but the sequence itself.
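For context, the current averaging step might look roughly like this (a minimal sketch; the `w2v` lookup and `tfidf` dictionary are assumed names, not code from the question):

    import numpy as np

    # Sketch of the current fixed-size feature: a TF-IDF-weighted average of word2vec vectors.
    # w2v maps a word to its 300-dim vector, tfidf maps a word to its TF-IDF weight (assumed names).
    def sentence_vector(words, w2v, tfidf, dim=300):
        vectors = [w2v[w] * tfidf.get(w, 0.0) for w in words if w in w2v]
        if not vectors:
            return np.zeros(dim)
        return np.mean(vectors, axis=0)  # one 300-dim vector per sentence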

    # Current model: 2 features, i.e. 2 input vectors of dimension 300
    r_model = Sequential()
    r_model.add(LSTM(200, input_shape=(2, 300), return_sequences=True))
    r_model.add(LSTM(300, return_sequences=True))
    ...

I want to pad each list of word vectors with zero vectors up to the maximum length of 105, and then feed this sequence to the input:

    # I want to feed a sequence as input instead
    # max_len - the maximum length of the list of vectors
    max_len = 105
    r_model = Sequential()
    r_model.add(LSTM(200, input_shape=(2, max_len, 300), return_sequences=True))
    r_model.add(LSTM(300, return_sequences=True))
    ...

But Keras complains that it expects 3 dimensions, not the 4 it gets in the second case:

    Input 0 is incompatible with layer lstm_5: expected ndim=3, found ndim=4

1 answer

Any neural network, including a recurrent one, expects an object of fixed dimension as input (and likewise produces an object of fixed dimension as output).

1. To feed a sequence in and get a sequence out, you can move over the data (in your case, the text) with a sliding window of fixed size, or, if a text is shorter, pad it with zeros in some way to the desired dimension (see the padding sketch after this list).

2. Another method is to combine the vector representations into one fixed-length structure that behaves like a Bloom filter (a toy sketch also follows the list).

   As far as I know, such a technique is used in chemoinformatics for fast search of molecules with desired properties, and it should be applicable to fast text search as well: in the case of a molecule, the role of the word expressed as a vector is played by a pharmacophoric fragment, and the text corresponds to the whole molecule (if you want to understand what this is and form your own opinion on whether the analogy is sound, you can read about pharmacophores, for example, on geektimes).

   However, as an input to a neural network the Bloom filter may not give very good results because of false positives, and in essence this method would resemble the one you are using now, while also ignoring the weights of the words in the text. Therefore, I think a sliding window would suit you better.
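As a toy illustration of item 2, a Bloom-filter-style fixed-length encoding could look like the sketch below (not code from the answer; the filter size and the number of hash functions are arbitrary assumptions, and word identities rather than the word2vec vectors themselves are hashed for simplicity):

    import hashlib
    import numpy as np

    def bloom_encode(words, n_bits=1024, n_hashes=3):
        # Set several hash-determined bits per word; the result has a fixed length
        # regardless of how many words the text contains.
        bits = np.zeros(n_bits, dtype=np.uint8)
        for w in words:
            for seed in range(n_hashes):
                h = hashlib.md5(f"{seed}:{w}".encode("utf-8")).hexdigest()
                bits[int(h, 16) % n_bits] = 1
        return bits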
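For the zero-padding variant of item 1, a minimal Keras sketch might look as follows, assuming one sequence of word vectors per sample, so the per-sample input shape is (max_len, 300). The ndim=4 error in the question arises because input_shape=(2, max_len, 300) plus the batch dimension gives four axes, while a standard LSTM layer expects three: (batch, timesteps, features). If the two 300-dimensional features are really two separate texts, they would need to be handled separately (e.g. as two model inputs), which is beyond this sketch.

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM

    max_len = 105  # maximum number of words in a text
    dim = 300      # word2vec dimensionality

    def pad_word_vectors(word_vectors, max_len=max_len, dim=dim):
        # Pad a list of word vectors with zero vectors (and truncate longer texts) to a fixed length.
        padded = np.zeros((max_len, dim))
        n = min(len(word_vectors), max_len)
        if n:
            padded[:n] = word_vectors[:n]
        return padded

    # One padded text per sample gives X of shape (num_samples, max_len, dim);
    # input_shape excludes the batch dimension, so the LSTM sees ndim=3, as Keras expects.
    model = Sequential()
    model.add(LSTM(200, input_shape=(max_len, dim), return_sequences=True))
    model.add(LSTM(300, return_sequences=True))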

• So I do want to pad with zeros. That is what max_len = 105 in the code is for: I meant to pad all the other sequences of word vectors to that length. But Keras rejects such a shape. I will update the question now. Thanks anyway, it was interesting to read about the Bloom filter. – OptimalStrategy