Improving Neural Network Performance
A brief discussion on improving the performance of neural networks.
With an increase in data, a model should perform better, and in many cases it does and gives very good results. However, a major problem models face is that accuracy saturates after a certain point. For example, suppose our model reaches an accuracy of 85%, which is a very good number, but our target is 92%. What steps can we take to improve the model?
To improve performance, we need to tune the hyperparameters of the model. Hyperparameters are the values set by the user before running the algorithm or training the model. Below are some of the hyperparameters that can help us achieve a better result.
- The number of hidden layers: Every neural network has three types of layers: an input layer, hidden layers, and an output layer. Adding hidden layers is generally a good way to get better results. To find how many are actually needed, it's better to check for overfitting after adding each hidden layer; once the model starts overfitting, we stop adding layers (see the first sketch after this list).
- The number of neurons in each layer: We can improve performance by choosing a suitable number of neurons per layer. For the input layer, the number of neurons is always fixed, i.e. the number of columns in the data. For binary classification and regression, the output layer has a single neuron, while for multi-class classification it has one neuron per class. For the hidden layers there is no fixed formula; we add enough neurons for the model to learn smoothly and, in case of overfitting, start decreasing the count.
- Batch size: There are three variants of gradient descent: batch, stochastic, and mini-batch. In batch gradient descent the weights are updated after all rows have been processed, while in stochastic gradient descent they are updated after every row. Mini-batch gradient descent is an intermediate of the two, where we pick a number of rows (the batch size) and update the weights after each such batch. Broadly we have two options for the batch size: a small number like 32 or a big number like 8000. With a small batch size training is slow but results are comparatively better, while with a large one training is fast. One remedy is to warm up the learning rate, scaling it with the batch size: a small learning rate for a small batch size and a higher one for larger batches.
- Epochs: To find the right number of epochs, we can use early stopping, which halts training once the model's accuracy becomes stable. At initialization we still set a maximum number of epochs as an upper bound (the second sketch after this list shows batch size and early stopping together).
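As a concrete illustration of the first two points, here is a minimal sketch in Keras (an assumed choice of library; the feature count and layer sizes are purely illustrative) of a model whose hidden layers and neurons per layer can be grown or shrunk while watching for overfitting:

```python
from tensorflow import keras
from tensorflow.keras import layers

n_features = 20  # input neurons = number of columns in the data (illustrative)

model = keras.Sequential([
    layers.Input(shape=(n_features,)),
    layers.Dense(64, activation="relu"),    # 1st hidden layer
    layers.Dense(32, activation="relu"),    # 2nd hidden layer; add more one at a
                                            # time while checking for overfitting
    layers.Dense(1, activation="sigmoid"),  # single output neuron for binary
                                            # classification
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```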
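And a sketch of the batch size and early stopping settings, reusing the `model` above; the random dummy data, patience, and epoch cap are all illustrative assumptions:

```python
import numpy as np
from tensorflow.keras.callbacks import EarlyStopping

# Dummy data standing in for a real dataset (illustrative only).
X_train = np.random.rand(1000, 20)
y_train = np.random.randint(0, 2, size=(1000,))

early_stop = EarlyStopping(
    monitor="val_loss",         # stop when validation loss stops improving
    patience=5,                 # wait 5 epochs before giving up
    restore_best_weights=True,  # roll back to the best epoch seen
)

history = model.fit(
    X_train, y_train,
    validation_split=0.2,
    batch_size=32,              # mini-batch size: weights update every 32 rows
    epochs=100,                 # upper bound; early stopping usually ends sooner
    callbacks=[early_stop],
)
```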
Problems encountered while training neural networks and how to fix them
- Vanishing and exploding gradients: In the vanishing-gradient problem, the derivatives of the loss with respect to the weights shrink to such tiny values that the updates become negligible, while in the exploding-gradient problem the same values become very large. These problems occur mainly with saturating activation functions like sigmoid, and can be resolved using ReLU or Leaky ReLU activations (first sketch after this list).
- Not enough data: Deep learning models are data-hungry and need huge amounts of data, but there are cases where we simply don't have enough, which hurts the model's performance. This can be rectified using transfer learning, i.e. reusing someone else's pre-trained model on our data (second sketch below).
- Slow training of the model: Often the model trains very slowly. In this case, we can use a better optimizer or a learning-rate scheduler to speed up training (third sketch below).
- Overfitting: L1 and L2 regularization, or dropout, can be used to rectify this problem (fourth sketch below).
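A minimal sketch of the first fix, swapping sigmoid out of the hidden layers in favour of ReLU and Leaky ReLU (layer sizes assumed for illustration):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),   # ReLU: gradient is 1 for positive inputs,
                                           # so it does not vanish like sigmoid's
    layers.Dense(32),
    layers.LeakyReLU(),                    # Leaky ReLU also keeps a small gradient
                                           # for negative inputs
    layers.Dense(1, activation="sigmoid"), # sigmoid only at the output
])
```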
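A transfer-learning sketch, assuming image data and reusing a pre-trained MobileNetV2 from keras.applications with a small trainable head on top (the input shape and class count are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

base = keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,      # drop the original ImageNet classifier head
    weights="imagenet",     # reuse weights learned on ImageNet
)
base.trainable = False      # freeze the pre-trained layers

model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(10, activation="softmax"),  # 10 classes (illustrative)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```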
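A sketch of a faster optimizer combined with a learning-rate scheduler, assuming the `model`, `X_train`, and `y_train` from the earlier sketches:

```python
from tensorflow import keras
from tensorflow.keras.callbacks import ReduceLROnPlateau

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),  # adaptive optimizer
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

reduce_lr = ReduceLROnPlateau(
    monitor="val_loss",  # shrink the learning rate when validation loss stalls
    factor=0.5,          # halve the learning rate each time
    patience=3,          # after 3 epochs with no improvement
)

model.fit(X_train, y_train, validation_split=0.2,
          epochs=50, batch_size=32, callbacks=[reduce_lr])
```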
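And a sketch of L2 regularization plus dropout to curb overfitting; the penalty strength and dropout rate are illustrative:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01)),  # L2 penalty on weights
    layers.Dropout(0.3),  # randomly drop 30% of activations during training
    layers.Dense(1, activation="sigmoid"),
])
```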
This has been a very brief discussion on how to improve neural networks, since most people encounter these issues while training.