Concatenating metadata with keras embeddings
Keras Embeddings RNN Deep Learning
Introduction #
Keras is a great framework that lets you build and prototype deep networks fast, but customizing an aspect of a model that is not supported out of the box can be painful.
This happened to me this week. I'm currently building a stateful recurrent network, and for the data I am modelling it makes sense to feed some metadata along with the embeddings into the RNN. Keras did not like this at all. I tried a bunch of online solutions and asked a question on Stack Overflow, but I couldn't find a solution, so I'm writing this article to describe the problem I encountered and how to get around it, in case you want to model something similar.
The problem #
Quick note about my environment: I am running Keras 2.0 with TensorFlow 1.0.1 and Python 3.5 on Windows.
Let's start by importing all the Keras bits we'll need:
import keras
from keras.models import Model
from keras.layers import *
from keras.optimizers import Adam
from keras.engine.topology import Layer
import keras.backend as K
Next we'll create our input tensors. In this example we have two inputs: `input` will be turned into an embedding, and `input2` is a tensor containing integer metadata that we'll feed as-is into our RNN.
- I'm using the NN to model a hierarchical structure. These are represented as timesteps, with the number of timesteps controlled by `frames`.
- To keep this example simple, `batch_size` is set to 1. In other online examples I came across, feeding multiple batches at a time got complicated, so it is avoided here.
batch_size = 1
frames = 3
input = Input(batch_shape=(batch_size,3,1))
input2 = Input(batch_shape=(batch_size,3,5))
inputEmb = Embedding(50,10,input_length = 1)(input)
Running this code will create tensors with the following shapes:
- `input` (1, 3, 1) -> `inputEmb` (1, 3, 1, 10)
- `input2` (1, 3, 5)
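To see why the embedding gains an extra trailing dimension, here is a minimal pure-Python sketch of an embedding lookup. It is not Keras code: the helper names `embed` and `shape` and the randomly filled `table` are assumptions made for illustration. Every integer id is replaced by a vector of length 10, so a (1, 3, 1) input becomes (1, 3, 1, 10).

```python
import random

def embed(ids, table):
    """Recursively replace every integer id with its row from the table."""
    if isinstance(ids, int):
        return list(table[ids])
    return [embed(item, table) for item in ids]

def shape(nested):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

# Same sizes as Embedding(50, 10); the values are random placeholders.
vocab_size, embedding_dim = 50, 10
random.seed(0)
table = [[random.random() for _ in range(embedding_dim)]
         for _ in range(vocab_size)]

# One batch of 3 frames, each frame holding a single integer id.
ids = [[[4], [17], [42]]]
embedded = embed(ids, table)

print(shape(ids))       # (1, 3, 1)
print(shape(embedded))  # (1, 3, 1, 10)
```

The metadata tensor never goes through such a lookup, which is why it stays at three dimensions.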
And when we concatenate the two tensors:
encodedInput = Concatenate()([inputEmb, input2])
BOOM
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\marco\Anaconda3\lib\site-packages\keras\engine\topology.py", line 521, in __call__
self.build(input_shapes)
File "C:\Users\marco\Anaconda3\lib\site-packages\keras\layers\merge.py", line 153, in build
'Got inputs shapes: %s' % (input_shape))
ValueError: `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(1, 1, 10), (1, 3, 5)]
The shapes Keras is complaining about are somewhat different from the actual shapes. If you look at the shapes again you will notice that the embedding has 4 dimensions and the input has 3, so I thought: let's just reshape `input2` to have the same number of dimensions.
inputReshaped = Reshape((3,1,5))(input2)
inputReshaped.shape
>>> TensorShape([Dimension(1), Dimension(3), Dimension(1), Dimension(5)])
encodings = [inputEmb, inputReshaped]
encodedInput = Concatenate()(encodings)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\marco\Anaconda3\lib\site-packages\keras\engine\topology.py", line 521, in __call__
self.build(input_shapes)
File "C:\Users\marco\Anaconda3\lib\site-packages\keras\layers\merge.py", line 153, in build
'Got inputs shapes: %s' % (input_shape))
ValueError: `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(1, 1, 10), (1, 3, 1, 5)]
Not good... I tried multiple reshapes, concatenate commands and lambda layers, but nothing worked. After lots of trial and error and looking at the source code I noticed something. The Embedding layer in Keras is designed with RNNs in mind: layers consuming an embedding somehow unroll the timeframe and consume it sequentially, which makes perfect sense for an RNN, but when you concatenate it with a standard input it does not get unrolled and bad things happen.
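The rule quoted in the error message (all dimensions must match except the concat axis) can be sketched in a few lines of plain Python. This is only an illustration of the rule, not the Keras implementation, and `concat_shape` is a hypothetical helper name. It shows that the actual tensor shapes would be compatible, while the shapes Keras reports are not.

```python
def concat_shape(shapes, axis=-1):
    """Return the concatenated shape, or raise ValueError on a mismatch."""
    rank = len(shapes[0])
    if any(len(s) != rank for s in shapes):
        raise ValueError("rank mismatch: %s" % (shapes,))
    axis = axis % rank  # allow negative axes such as -1
    for s in shapes[1:]:
        for dim in range(rank):
            if dim != axis and s[dim] != shapes[0][dim]:
                raise ValueError(
                    "shape mismatch off the concat axis: %s" % (shapes,))
    out = list(shapes[0])
    out[axis] = sum(s[axis] for s in shapes)
    return tuple(out)

# The actual tensor shapes after reshaping input2 are compatible:
print(concat_shape([(1, 3, 1, 10), (1, 3, 1, 5)]))  # (1, 3, 1, 15)

# The shapes quoted in the error message are not:
try:
    concat_shape([(1, 1, 10), (1, 3, 1, 5)])
except ValueError as err:
    print("rejected:", err)
```

So the concatenation itself is sound. The failure comes from the shape Keras believes the embedding output has, not from the tensors themselves.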
The solution #
Luckily, writing a custom layer sidesteps this subtlety, so I was able to create a special concatenate layer that does the operation.
class ConcatBatch(Layer):
    def __init__(self, **kwargs):
        super(ConcatBatch, self).__init__(**kwargs)

    def build(self, input_shape):
        super(ConcatBatch, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self, x):
        a, b = x[0], x[1]  # a: embedding output, b: metadata
        # Give the metadata the same number of dimensions as the embedding
        b2 = K.reshape(b, (batch_size, frames, 1, 5))
        encodingsReshaped = [a, b2]
        # Join on the feature axis: 10 embedding values + 5 metadata values
        encodedInput = K.concatenate(encodingsReshaped, axis=3)
        # Drop the singleton axis so the RNN gets (batch, timesteps, features)
        out = K.reshape(encodedInput, (batch_size, frames, 15))
        return out

    def compute_output_shape(self, input_shape):
        return (batch_size, frames, 15)
I don't think this is a really elegant solution but it works. With a bit more work this custom layer can be made more versatile, but the current implementation works with fixed sizes[^1].
`__init__` and `build` let you create layers with their own custom weights and other wizardry, but here we are only interested in the `call` method in order to do some backend operations on our tensors.
We unpack our list and use the lower-level operations `backend.reshape` and `backend.concatenate`. Now executing `encodedInput = ConcatBatch()(encodings)` gives us one nice concatenated tensor.
Here are the tensor shapes for comparison:
- `a` (1, 3, 1, 10)
- `b` (1, 3, 1, 5) -> `b2` (1, 3, 1, 5)[^2]
- `encodedInput` (1, 3, 1, 15) (RNNs don't like (1, 15) shapes, so we reshape to `out`)
- `out` (1, 3, 15)
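If the backend calls feel opaque, the same operation can be written out with plain nested lists. This is a sketch for intuition only, with made-up values and a hypothetical `concat_batch` function. The real layer does this with `K.reshape` and `K.concatenate`. For every frame of every batch, the 10 embedding values and the 5 metadata values are joined into one vector of 15.

```python
def concat_batch(a, b2):
    """Join embedding and metadata features for every frame in every batch.

    a  has shape (batch, frames, 1, emb_dim)
    b2 has shape (batch, frames, 1, meta_dim)
    The result has shape (batch, frames, emb_dim + meta_dim).
    """
    out = []
    for a_frames, b_frames in zip(a, b2):
        frames_out = []
        for a_frame, b_frame in zip(a_frames, b_frames):
            # Concatenate on the last axis, then drop the singleton axis.
            frames_out.append(a_frame[0] + b_frame[0])
        out.append(frames_out)
    return out

batch_size, frames = 1, 3
# Placeholder values: 0.1 stands in for embedding weights, 7 for metadata.
a = [[[[0.1] * 10] for _ in range(frames)] for _ in range(batch_size)]
b2 = [[[[7] * 5] for _ in range(frames)] for _ in range(batch_size)]

out = concat_batch(a, b2)
print(len(out), len(out[0]), len(out[0][0]))  # 1 3 15
```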
Snap on an RNN and an output layer and you are ready to train your model!
rnn = LSTM(50,
           dropout=0.0,
           activation='relu',
           recurrent_dropout=0.0,
           stateful=True)(encodedInput)
output = Dense(3, activation='softmax')(rnn)
Full Code Sample #
import keras
from keras.models import Model
from keras.layers import *
from keras.optimizers import Adam
from keras.engine.topology import Layer
import keras.backend as K
class ConcatBatch(Layer):
    def __init__(self, **kwargs):
        super(ConcatBatch, self).__init__(**kwargs)

    def build(self, input_shape):
        super(ConcatBatch, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self, x):
        a, b = x[0], x[1]
        b2 = K.reshape(b, (batch_size, frames, 1, 5))
        encodingsReshaped = [a, b2]
        encodedInput = K.concatenate(encodingsReshaped, axis=3)
        out = K.reshape(encodedInput, (batch_size, frames, 15))
        print(a.shape)
        print(b.shape)
        print(b2.shape)
        print(encodedInput.shape)
        print(out.shape)
        return out

    def compute_output_shape(self, input_shape):
        return (batch_size, frames, 15)

batch_size = 1
frames = 3
input = Input(batch_shape=(batch_size,3,1))
input2 = Input(batch_shape=(batch_size,3,5))
inputTagEnc = Embedding(50,10,input_length = 1)(input)
encodings =[inputTagEnc, input2]
encodedInput = ConcatBatch()(encodings)
rnn = LSTM(50,
           dropout=0.0,
           activation='relu',
           recurrent_dropout=0.0,
           stateful=True)(encodedInput)
output = Dense(3, activation='softmax')(rnn)
model = Model(inputs=[input,input2], outputs=[output])
model.compile(loss='categorical_crossentropy', optimizer=Adam())
model.summary()
Bonus #
One nice thing I found about this approach is that passing multiple batches is trivial. Increase the batch size and everything keeps working, something I was not able to do with some other examples I found on the interwebz.
Output example with a batch_size of 10:

Layer (type) | Output Shape | Param #
--- | --- | ---
input_24 (InputLayer) | (10, 3, 1) | 0
embedding_13 (Embedding) | (10, 1, 10) | 500
input_25 (InputLayer) | (10, 3, 5) | 0
concat_batch_9 (ConcatBatch) | (10, 3, 15) | 0
lstm_5 (LSTM) | (10, 50) | 13200
dense_6 (Dense) | (10, 3) | 153
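
The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for embedding, LSTM and dense layers (the function names are mine, not part of Keras). The LSTM count of 13200 only works out with an input size of 15, which confirms that the RNN really sees the concatenated 10 + 5 features.

```python
def embedding_params(vocab_size, output_dim):
    # One vector of length output_dim per vocabulary entry
    return vocab_size * output_dim

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per unit
    return input_dim * units + units

print(embedding_params(50, 10))  # 500
print(lstm_params(15, 50))       # 13200
print(dense_params(50, 3))       # 153
```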
[^1]: My final version actually takes a list of 8 embeddings and one metadata tensor and does the concatenation, so it is even dirtier, but I stuck to 1 embedding and 1 metadata tensor in this article for simplicity's sake.
[^2]: In my original code `b` is actually (1, 3, 5), so the reshape in this instance is not really needed.