In this article, we'll cover the basics of Artificial Neural Networks (ANNs) and implement one in Python.
An Artificial Neural Network (ANN) is a computational system inspired by the structure, processing method, and learning ability of a biological brain. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like a synapse in a biological brain, can transmit a signal to other neurons. An artificial neuron receives a signal, processes it, and can then signal the neurons connected to it.
Table of Contents:
1. Neuron
2. Facts
3. ANN Layers
4. Activation Function
5. Gradient Descent
6. Code in Python
Neuron:-
[Figure: Human Neurons]
Cell Body : The neuron cell body contains the nucleus and performs the biochemical transformations necessary for the life of the neuron.
Dendrites : Every neuron has fine, hair-like tubular structures (extensions) around it that branch out into a tree around the cell body. They accept incoming signals.
Axon : A long, thin, tubular structure that works like a transmission line.
Synapse : Neurons are linked together in a complex spatial arrangement. When the axon reaches its final destination, it branches again in what is called a terminal arborization. At the end of the axon are highly complex and specialized structures called synapses. At these synapses, the connection between two neurons takes place.
Facts :-
| Human brain | Intel Pentium 4 1.5 GHz |
|---|---|
| Contains about 10^11 (100 billion) basic units called neurons; each neuron is connected to about 10^4 other neurons | Number of transistors: 4.2 x 10^7 |
| Weight: ~0.3 kg at birth, ~1.5 kg as an adult | Weight: 0.1 kg cartridge without fans, 0.3 kg with fan/heatsink |
| Power consumption: 20-40 W (~20% of body consumption) | Power consumption: up to 55 W |
| Operating temperature: 37 ± 2 °C | Operating temperature: 15-85 °C |
| Frequency of a neuron: ~250-2000 Hz | Frequency: 1.5 GHz |
| Sleep requirement: average 7.5 hours (adult) | Sleep requirement: 0 (if not overheated/overclocked) |
The figure below shows the working of a human neuron and an artificial neuron.
Artificial Neural Network (ANN) Layers:-
An artificial neural network is typically organized in layers. Layers are made up of many interconnected 'nodes', each of which contains an 'activation function'. A neural network may contain the following three layers:
INPUT LAYER:- The purpose of the input layer is to receive, as input, the values of the explanatory attributes for each observation. The number of nodes in the input layer is usually equal to the number of explanatory variables. The input layer presents the patterns to the network, which passes them on to one or more hidden layers.
The input layer nodes are passive, which means they do not change the data. Each node receives a single value on its input and duplicates that value on its many outputs, sending a copy of each input value to all of the hidden nodes.
HIDDEN LAYER:- Hidden layers apply transformations to the input values within the network. Each hidden node has incoming arcs from input nodes or other hidden nodes, and outgoing arcs to output nodes or other hidden nodes. In the hidden layer, the actual processing is done through a system of weighted 'connections'.
There may be one or more hidden layers. Values entering a hidden node are multiplied by weights, a set of predetermined numbers stored in the program. The weighted inputs are then summed to produce a single number, as the sketch below illustrates.
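As a minimal sketch of that weighted sum for one hidden node (the input values, weights, and bias below are made-up numbers for illustration, not from the article):

```python
import numpy as np

inputs = np.array([0.5, 0.9, -0.3])   # hypothetical values arriving at one hidden node
weights = np.array([0.8, -0.1, 0.4])  # hypothetical predetermined weights for that node
bias = 0.1

# Each value is multiplied by its weight, then everything is summed into one number
weighted_sum = np.dot(inputs, weights) + bias
print(weighted_sum)  # ~0.29
```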
OUTPUT LAYER:- The hidden layers link to the 'output layer'. The output layer receives connections from the hidden layers or from the input layer, and returns an output value that corresponds to the prediction of the response variable. In classification problems, there is usually one output node per class (or a single node for binary problems).
The active nodes of the output layer combine and transform the data to generate the output values. The ability of the neural network to perform useful data manipulation lies in the proper choice of weights. This is different from conventional information processing.
ACTIVATION FUNCTION:-
The activation function decides whether a neuron should be activated or not, by calculating the weighted sum of its inputs and adding a bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron.
The input is fed to the input layer, and the neurons perform a linear transformation on this input using the weights and biases:
x = (weight * input) + bias
Then, an activation function is applied to the above result.
Finally, the output from the activation function moves to the next hidden layer and the same process is repeated. This forward movement of information is known as forward propagation.
What if the generated output is far away from the actual value? Using the output from forward propagation, an error is calculated. Based on this error value, the weights and biases of the neurons are updated. This process is known as back-propagation.
Popular types of Activation Function:-
| Activation Function | Equation | Range | Uses | Nature |
|---|---|---|---|---|
| ReLU | A(x) = max(0, x) | 0 to ∞ | Less computationally expensive than tanh and sigmoid | Non-linear |
| Tanh | tanh(x) = 2*sigmoid(2x) - 1 | -1 to +1 | Usually used in hidden layers of a neural network, as its values lie between -1 and 1 | Non-linear |
| Softmax | z = np.exp(x); z = z / z.sum() | 0 to 1 (outputs sum to 1) | Usually used when handling multiple classes | Non-linear |
| Sigmoid | A(x) = 1 / (1 + e^(-x)) | 0 to 1 | The result can easily be predicted as 1 if the value is greater than 0.5, and 0 otherwise | Non-linear |
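For reference, here is how the four activations from the table can be written in plain NumPy (a sketch; Keras ships its own built-in versions):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)       # range: 0 to infinity

def tanh(x):
    return np.tanh(x)             # range: -1 to +1

def sigmoid(x):
    return 1 / (1 + np.exp(-x))   # range: 0 to 1

def softmax(x):
    z = np.exp(x - x.max())       # subtracting the max improves numerical stability
    return z / z.sum()            # outputs lie in (0, 1) and sum to 1

x = np.array([-2.0, 0.0, 3.0])
print(relu(x), tanh(x), sigmoid(x), softmax(x), sep="\n")
```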
GRADIENT DESCENT:-
Gradient descent is an optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent, as defined by the negative of the gradient. In machine learning, we use gradient descent to update our model's parameters. Parameters refer to the coefficients in linear regression and the weights in neural networks.
Cost Function: A cost (loss) function tells us "how good" our model is at making predictions for a given set of parameters. The cost function has its own curve and gradients. The slope of this curve tells us how to update our parameters to make the model more accurate.
There are a number of gradient descent algorithms out there. I'll mention a few below, with a minimal sketch after the list:
Batch Gradient Descent
Stochastic Gradient Descent
Mini-batch Gradient Descent
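Here is a minimal sketch of plain (batch) gradient descent minimizing an illustrative cost function J(w) = (w - 3)^2; the function, starting point, and learning rate are all chosen only for illustration:

```python
# Illustrative cost function J(w) = (w - 3)^2, whose gradient is dJ/dw = 2(w - 3)
def gradient(w):
    return 2 * (w - 3)

w = 0.0               # made-up initial parameter value
learning_rate = 0.1   # made-up step size

for step in range(50):
    w -= learning_rate * gradient(w)  # step in the direction of the negative gradient

print(w)  # converges toward the minimum at w = 3
```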
Code in Python:-
After all this theory, let's do something practical. First, import the basic packages and the dataset. You can freely use any available dataset; here I use the MNIST handwritten digit (0-9) dataset.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.datasets import mnist

# Split the dataset into train and test
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Data Pre-processing
x_train.shape  # (60000, 28, 28)
x_test.shape   # (10000, 28, 28)
The MNIST dataset has 70k elements; after splitting into train and test, there are 60k elements for training and 10k for testing. (28, 28) means each image is 28 pixels in length and 28 pixels in width.
x_train[59999]              # output as an array
plt.imshow(x_train[59999])  # show as a figure
y_train[59999]              # 8

y_train.shape  # (60000,)
y_test.shape   # (10000,)
For calculation and understanding purposes, reshape the data:
x_train = x_train.reshape(60000, 784)
x_train.shape  # (60000, 784) -- instead of 28*28, each image is a 1*784 array

x_test = x_test.reshape(10000, 784)
x_test.shape   # (10000, 784) -- instead of 28*28, each image is a 1*784 array
Scaling the data (into one single range)
import keras as ks

x_train = ks.utils.normalize(x_train)
x_train[59999]

x_test = ks.utils.normalize(x_test)
x_test[0]
One-Hot Encoding
y_train = ks.utils.to_categorical(y_train)
y_train

y_test = ks.utils.to_categorical(y_test)
y_test[0]  # array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], dtype=float32)
Now Build the Neural Network
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
Adding Layers
# Input layer
model.add(Dense(input_dim=x_train.shape[1], kernel_initializer="random_uniform",
                activation="relu", units=250))
# Hidden layer
model.add(Dense(kernel_initializer="random_uniform", activation="relu", units=150))
# Output layer
model.add(Dense(kernel_initializer="random_uniform", activation="softmax", units=10))
input_dim is the number of neurons in the input layer, i.e. 784; now you can see why we converted the multi-dimensional images into a single dimension. kernel_initializer is used to assign the initial weights, the activation function is ReLU, and units=250 is the number of neurons in that layer, whose outputs feed the next layer; that's why input_dim is not needed in the hidden layer.
There are digits 0 to 9, ten digits in total; that's why SOFTMAX is used as the activation function of the output layer, with units=10.
Compile the Neural Network
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
Train the Neural Network
model.fit(x_train, y_train, epochs=10, batch_size=32)
epochs, batch_size, and the units of the input and hidden layers are hyper-parameters, which means you can tune these parameters for better accuracy; a small tuning sketch follows below. Training can take time, depending on your software and hardware.
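As a small illustration of such tuning, you could loop over a few hidden-layer sizes and compare validation accuracy. This is only a sketch: the candidate values and the 5-epoch budget are arbitrary, not recommendations, and training several models takes correspondingly longer.

```python
from keras.models import Sequential
from keras.layers import Dense

# Try a few layer sizes (arbitrary candidates) and report validation accuracy
for units in [100, 150, 250]:
    m = Sequential()
    m.add(Dense(input_dim=x_train.shape[1], activation="relu", units=units))
    m.add(Dense(activation="softmax", units=10))
    m.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    history = m.fit(x_train, y_train, epochs=5, batch_size=32,
                    validation_split=0.1, verbose=0)
    # The key may be "val_acc" in very old Keras versions
    print(units, history.history["val_accuracy"][-1])
```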
Prediction
pred = model.predict(x_test)
pred = pred > 0.5
pred

y_test
Compare pred and y_test to see the result of our hard work.
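For a quick spot-check of a single test image (a sketch using the pred and y_test arrays defined above; np.argmax recovers the digit index from a one-hot row):

```python
import numpy as np

i = 0  # index of the test image to inspect
print("predicted digit:", np.argmax(pred[i]))    # position of the winning class
print("actual digit:   ", np.argmax(y_test[i]))  # position of the 1 in the one-hot row
```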
Accuracy
from sklearn.metrics import accuracy_score
accuracy_score(y_test, pred)  # 0.9784, i.e. 97.84%
You can increase the accuracy by tuning the hyper-parameters.
Congratulations, you can now make your own model! For any doubts, leave a comment. THANK YOU FOR READING.