Concatenating metadata with Keras embeddings

Introduction

Keras is a great framework that lets you build and prototype deep networks quickly, but when you try to customize an aspect of a model that isn't supported out of the box, you may experience some pain.

This happened to me this week. I'm currently building a stateful recurrent network, and for the data I am modelling it makes sense to feed some metadata into the RNN along with the embeddings. Keras did not like this at all. I tried a bunch of online solutions and asked a question on stackoverflow{:target="_blank"}, but I couldn't find a working answer, so I'm writing this article about the problem I encountered and how to get around it, in case you want to model something similar.

The problem

A quick note about my environment: I am running Keras 2.0 with TensorFlow 1.0.1 and Python 3.5 on Windows.

Let's start by importing all the Keras bits we'll need:

import keras
from keras.models import Model
from keras.layers import *
from keras.optimizers import Adam
from keras.engine.topology import Layer
import keras.backend as K

Next we'll create our input tensors. In this example we have two inputs: a tensor of integer ids that will go through an embedding, and a tensor of metadata features.

batch_size = 1
frames = 3

input = Input(batch_shape=(batch_size, 3, 1))    # integer ids for the embedding
input2 = Input(batch_shape=(batch_size, 3, 5))   # metadata features
inputEmb = Embedding(50, 10, input_length=1)(input)

Running this code will create tensors with the following shapes:

inputEmb.shape
>>> TensorShape([Dimension(1), Dimension(3), Dimension(1), Dimension(10)])
input2.shape
>>> TensorShape([Dimension(1), Dimension(3), Dimension(5)])

And when we concatenate the two tensors

encodedInput = Concatenate()([inputEmb, input2])

BOOM

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\marco\Anaconda3\lib\site-packages\keras\engine\topology.py", line 521, in __call__
    self.build(input_shapes)
  File "C:\Users\marco\Anaconda3\lib\site-packages\keras\layers\merge.py", line 153, in build
    'Got inputs shapes: %s' % (input_shape))
ValueError: `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(1, 1, 10), (1, 3, 5)]

The shapes Keras is complaining about are somewhat different from the actual tensor shapes. If you look at the shapes again you will notice that the embedding tensor has four dimensions while input2 only has three, so my first thought was to simply reshape input2 to have the same number of dimensions.

inputReshaped = Reshape((3, 1, 5))(input2)

inputReshaped.shape
>>> TensorShape([Dimension(1), Dimension(3), Dimension(1), Dimension(5)])

encodings = [inputEmb, inputReshaped]
encodedInput = Concatenate()(encodings)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\marco\Anaconda3\lib\site-packages\keras\engine\topology.py", line 521, in __call__
    self.build(input_shapes)
  File "C:\Users\marco\Anaconda3\lib\site-packages\keras\layers\merge.py", line 153, in build
    'Got inputs shapes: %s' % (input_shape))
ValueError: `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(1, 1, 10), (1, 3, 1, 5)]

Not good... I tried multiple reshapes, concatenate calls and Lambda layers, but nothing worked. After lots of trial and error and digging through the source code I noticed something: the Embedding layer in Keras is designed with RNNs in mind. Layers consuming an embedding somehow unroll the time dimension and consume it sequentially, which makes perfect sense for an RNN, but when you concatenate it with a standard input it does not get unrolled and bad things happen.
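You can actually see this mismatch by comparing the shape Keras tracks for the embedding output with the shape of the underlying TensorFlow tensor (the shapes in the comments are what the code above works out to):

print(K.int_shape(inputEmb))   # the shape Keras tracks for the graph: (1, 1, 10)
print(inputEmb.shape)          # the actual TensorFlow shape: (1, 3, 1, 10)
print(K.int_shape(input2))     # (1, 3, 5)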

The solution

Luckily, a custom layer sidesteps this subtlety, so I was able to write a special concatenate layer that does the operation.

class ConcatBatch(Layer):
    def __init__(self, **kwargs):
        super(ConcatBatch, self).__init__(**kwargs)

    def build(self, input_shape):
        super(ConcatBatch, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self, x):
        a, b = x[0], x[1]                                  # a: embedding, b: metadata
        b2 = K.reshape(b, (batch_size, frames, 1, 5))      # give the metadata the same rank as the embedding
        encodingsReshaped = [a, b2]
        encodedInput = K.concatenate(encodingsReshaped, axis=3)
        out = K.reshape(encodedInput, (batch_size, frames, 15))  # back to (batch, frames, features)
        return out

    def compute_output_shape(self, input_shape):
        return (batch_size, frames, 15)

I don't think this is a particularly elegant solution, but it works. With a bit more work this custom layer could be made more versatile, but the current implementation works with fixed sizes[^1].
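For what it's worth, here is a sketch of what a more versatile version could look like: instead of relying on the module-level batch_size and frames variables, the sizes are read from input_shape in build. ConcatBatchFlexible is just an illustrative name, and this variant still assumes the inputs are created with a fixed batch_shape, as they are in this article:

class ConcatBatchFlexible(Layer):
    def build(self, input_shape):
        # input_shape is a list: [embedding shape, metadata shape]
        emb_shape, meta_shape = input_shape
        self.batch, self.n_frames, self.meta_dim = meta_shape
        self.emb_dim = emb_shape[-1]
        super(ConcatBatchFlexible, self).build(input_shape)

    def call(self, x):
        a, b = x[0], x[1]
        # Give the metadata the same rank as the embedding tensor,
        # concatenate on the feature axis and flatten back to 3D for the RNN
        b2 = K.reshape(b, (self.batch, self.n_frames, 1, self.meta_dim))
        merged = K.concatenate([a, b2], axis=3)
        return K.reshape(merged, (self.batch, self.n_frames, self.emb_dim + self.meta_dim))

    def compute_output_shape(self, input_shape):
        emb_shape, meta_shape = input_shape
        return (meta_shape[0], meta_shape[1], emb_shape[-1] + meta_shape[-1])

It behaves exactly like the hard-coded version above, only the sizes are inferred from the inputs.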

`__init__` and `build` let you create layers with their own custom weights and other wizardry, but here we are only interested in the `call` method, where we can run backend operations on our tensors.
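As a side note, and just for reference, a bare-bones layer that does create its own weight might look something like this (ScaleLayer is a made-up example, not something used in this article):

class ScaleLayer(Layer):
    def build(self, input_shape):
        # add_weight is where a layer's trainable parameters are created
        self.scale = self.add_weight(name='scale',
                                     shape=(1,),
                                     initializer='ones',
                                     trainable=True)
        super(ScaleLayer, self).build(input_shape)

    def call(self, x):
        return x * self.scale

    def compute_output_shape(self, input_shape):
        return input_shape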

We unpack our list of inputs and use the lower-level backend operations K.reshape and K.concatenate[^2]. Now executing encodedInput = ConcatBatch()(encodings) gives us one nicely concatenated tensor. Here are the tensor shapes inside call for comparison (these are what the print statements in the full code below report):

a:            (1, 3, 1, 10)
b:            (1, 3, 5)
b2:           (1, 3, 1, 5)
encodedInput: (1, 3, 1, 15)
out:          (1, 3, 15)

Snap an RNN and an output layer on top and you are ready to train your model!

rnn = LSTM(50,
           dropout=0.0,
           activation='relu',
           recurrent_dropout=0.0,
           stateful=True)(encodedInput)
output = Dense(3, activation='softmax')(rnn)

Full Code Sample


import keras
from keras.models import Model
from keras.layers import *
from keras.optimizers import Adam
from keras.engine.topology import Layer
import keras.backend as K

class ConcatBatch(Layer):
    def __init__(self, **kwargs):
        super(ConcatBatch, self).__init__(**kwargs)

    def build(self, input_shape):
        super(ConcatBatch, self).build(input_shape)  # Be sure to call this somewhere!

    def call(self, x):
        a, b = x[0], x[1]
        b2 = K.reshape(b, (batch_size, frames, 1, 5))
        encodingsReshaped = [a, b2]
        encodedInput = K.concatenate(encodingsReshaped, axis=3)
        out = K.reshape(encodedInput, (batch_size, frames, 15))
        print(a.shape)
        print(b.shape)
        print(b2.shape)
        print(encodedInput.shape)
        print(out.shape)
        return out

    def compute_output_shape(self, input_shape):
        return (batch_size, frames, 15)


batch_size = 1
frames = 3

input = Input(batch_shape=(batch_size, 3, 1))
input2 = Input(batch_shape=(batch_size, 3, 5))

inputTagEnc = Embedding(50, 10, input_length=1)(input)

encodings = [inputTagEnc, input2]
encodedInput = ConcatBatch()(encodings)

rnn = LSTM(50,
           dropout=0.0,
           activation='relu',
           recurrent_dropout=0.0,
           stateful=True)(encodedInput)
output = Dense(3, activation='softmax')(rnn)


model = Model(inputs=[input,input2], outputs=[output])
model.compile(loss='categorical_crossentropy', optimizer=Adam())
model.summary()
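To sanity-check the whole thing, you can push a batch of random dummy data through the model. The arrays below are made up purely for illustration and simply match the shapes of the two Input layers:

import numpy as np

x1 = np.random.randint(0, 50, size=(batch_size, 3, 1))   # integer ids for the embedding
x2 = np.random.random((batch_size, 3, 5))                # metadata features
y = keras.utils.to_categorical(np.random.randint(0, 3, size=(batch_size,)), num_classes=3)

model.train_on_batch([x1, x2], y)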

Bonus

One nice thing I found about this approach is that using bigger batches is trivial: increase the batch size and everything keeps working, something I was not able to do with some of the other examples I found on the interwebz.

Output example with a batch_size of 10

| Layer (type)                 | Output Shape | Param # |
|------------------------------|--------------|---------|
| input_24 (InputLayer)        | (10, 3, 1)   | 0       |
| embedding_13 (Embedding)     | (10, 1, 10)  | 500     |
| input_25 (InputLayer)        | (10, 3, 5)   | 0       |
| concat_batch_9 (ConcatBatch) | (10, 3, 15)  | 0       |
| lstm_5 (LSTM)                | (10, 50)     | 13200   |
| dense_6 (Dense)              | (10, 3)      | 153     |

[^1]: My final version actually takes a list of 8 embeddings and one metadata tensor and does the concatenation, so it is even dirtier, but I stuck to one embedding and one metadata tensor in this article for simplicity's sake.

[^2]: In my original code b is actually (1, 3, 5), so the reshape in this instance is not really needed.