# Relation Network in One-Shot Learning

The relation network is a popular one-shot learning algorithm. It consists of two functions: an embedding function, denoted by $f_{\varphi}$, and a relation function, denoted by $g_{\phi}$. The embedding function extracts features from the input. If the input is an image, we can use a convolutional network as the embedding function, which gives us the feature vector (embedding) of the image. If the input is text, we can use an LSTM network to obtain the embedding of the text.

In one-shot learning, we have only a single example per class. Suppose our support set contains three classes with one example each. As shown in the figure below, the support set contains the three classes {lion, elephant, dog}.

Suppose we also have a query image $x_j$, as shown in the figure below, and we want to predict its class.

First, we take each image $x_i$ from the support set and pass it through the embedding function to extract its features, $f_{\varphi}(x_i)$. Since our support set contains images, we can use a convolutional network as the embedding function. The embedding function gives us a feature vector for each data point in the support set. Similarly, we obtain the embedding of the query image $x_j$ by passing it through the same embedding function, giving $f_{\varphi}(x_j)$.
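The embedding step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the names `embed` and `W_embed` are hypothetical, the images are random 8x8 arrays, and a single random linear projection with ReLU stands in for the convolutional network, whose weights would normally be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy support set: one 8x8 grayscale "image" per class (lion, elephant, dog).
support_images = rng.random((3, 8, 8))
# Toy query image x_j.
query_image = rng.random((8, 8))

# Hypothetical, untrained parameters standing in for a learned CNN.
W_embed = rng.normal(scale=0.1, size=(64, 16))

def embed(images):
    """Stand-in for f_varphi: flatten, project linearly, apply ReLU."""
    flat = images.reshape(-1, 64)            # (n, 64)
    return np.maximum(flat @ W_embed, 0.0)   # (n, 16)

# The same function (same parameters) embeds both support and query images.
support_emb = embed(support_images)          # shape (3, 16)
query_emb = embed(query_image[None])[0]      # shape (16,)
```

The point to notice is that one shared function produces both the support embeddings and the query embedding, so they live in the same feature space.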

Once we have the feature vectors of the support set, $f_{\varphi}(x_i)$, and of the query set, $f_{\varphi}(x_j)$, we combine them using an operator $Z$. $Z$ can be any combination operator; here we use concatenation to combine the feature vectors of the support and query sets.

As shown in the figure below, we combine the feature vectors of the support set, $f_{\varphi}(x_i)$, and the query set, $f_{\varphi}(x_j)$, into $Z(f_{\varphi}(x_i), f_{\varphi}(x_j))$. But what is the use of combining them like this? It helps us understand how the feature vector of an image in the support set is related to the feature vector of the query image.

In our example, it helps us understand how the feature vector of the lion, the feature vector of the elephant, and the feature vector of the dog are each related to the feature vector of the query image.
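A minimal sketch of the combination operator $Z$ as concatenation follows. The function name `combine` is hypothetical, and the embeddings are random placeholders (three support embeddings and one query embedding, each assumed to be 16-dimensional) rather than outputs of a trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder embeddings; in practice these come from the embedding function.
support_emb = rng.random((3, 16))   # f(x_i) for lion, elephant, dog
query_emb = rng.random(16)          # f(x_j) for the query image

def combine(support_emb, query_emb):
    """Z: concatenate each support embedding with the query embedding."""
    # Repeat the query embedding once per support example.
    tiled_query = np.broadcast_to(query_emb, support_emb.shape)
    return np.concatenate([support_emb, tiled_query], axis=1)

# One combined vector per (support class, query) pair.
pairs = combine(support_emb, query_emb)   # shape (3, 32)
```

Each row of `pairs` holds one support embedding followed by the query embedding, so there is one combined vector per class in the support set.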

But how can we measure this relatedness? This is where the relation function $g_{\phi}$ comes in. We pass the combined feature vectors to the relation function, which generates a relation score between 0 and 1 representing the similarity between a sample $x_i$ in the support set and a sample $x_j$ in the query set.

The following equation shows how the relation score $r_{ij}$ is computed in a relation network:

$r_{ij} = g_{\phi} ( Z(f_{\varphi}(x_i), f_{\varphi}(x_j)))$

where $r_{ij}$ denotes the relation score representing the similarity between class $i$ in the support set and the query image $x_j$. Since we have three classes in the support set and one image in the query set, we get three scores, one per class, indicating how similar each class in the support set is to the query image.
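To make the relation score concrete, here is a hedged sketch of a relation function applied to three combined vectors. Everything in it is an assumption for illustration: the name `relation`, the two-layer architecture, the layer sizes, and the random untrained weights. A sigmoid on the output is used as one way to keep each score between 0 and 1, and taking the class with the highest score as the prediction is the usual reading of the scores, not something the equation itself prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)
classes = ["lion", "elephant", "dog"]

# Placeholder for Z(f(x_i), f(x_j)): one 32-dim combined vector per class.
pairs = rng.random((3, 32))

# Hypothetical, untrained parameters of a small two-layer relation function.
W1 = rng.normal(scale=0.1, size=(32, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 1))
b2 = np.zeros(1)

def relation(pairs):
    """Stand-in for g_phi: MLP followed by a sigmoid, giving scores in (0, 1)."""
    hidden = np.maximum(pairs @ W1 + b1, 0.0)        # ReLU hidden layer
    logits = hidden @ W2 + b2                        # (n, 1)
    return (1.0 / (1.0 + np.exp(-logits))).ravel()   # (n,)

scores = relation(pairs)                             # r_ij, one per class
predicted = classes[int(np.argmax(scores))]          # most related class
```

With trained parameters, the score for the class that matches the query image would be expected to be close to 1 and the others close to 0; with the random weights used here, the scores carry no meaning beyond showing the shapes involved.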

The overall structure of a relation network in the one-shot learning setting is shown in the figure below:

In the next section, we will learn how relation networks are used in few-shot and zero-shot learning.