What Is an Artificial Neural Network in Machine Learning?

In this article, we'll cover the fundamentals of Artificial Neural Networks (ANNs) and implement one in Python.
An Artificial Neural Network (ANN) is a computational system inspired by the structure, processing method, and learning ability of a biological brain.
An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like a synapse in a biological brain, can transmit a signal to other neurons. An artificial neuron receives a signal, processes it, and can signal the neurons connected to it.

Table of Contents:
    1. Neuron
    2. Facts
    3. ANN Layers
    4. Activation Function
    5. Gradient Descent
    6. Code in Python


Neuron:-
Human Neurons
Cell Body: The neuron cell body contains the nucleus and performs the biochemical transformations necessary for the life of the neuron.

Dendrites: Every neuron has fine, hair-like tubular structures (extensions) around it. They branch out into a tree around the cell body and accept incoming signals.

Axon: A long, thin, tubular structure that works like a transmission line.

Synapse: Neurons are linked together in a complex spatial arrangement. When the axon reaches its final destination, it branches out again in what is called a terminal arborization. At the end of the axon are highly complex and specialized structures called synapses, where the connection between two neurons takes place.

Facts:-

| | Human brain | Intel Pentium 4 @ 1.5 GHz |
|---|---|---|
| Basic units | ~10^11 (100 billion) neurons, each connected to ~10^4 other neurons | ~4.2x10^7 transistors |
| Weight | ~0.3 kg at birth, ~1.5 kg adult | 0.1 kg (cartridge without fan), 0.3 kg with fan/heatsink |
| Power consumption | 20-40 W (~20% of body consumption) | up to 55 W |
| Operating temperature | 37±2 °C | 15-85 °C |
| Frequency | ~250-2000 Hz per neuron | 1.5 GHz |
| Sleep requirement | ~7.5 hours average (adult) | 0 (if not overheated/overclocked) |

The figure below shows the working of a human neuron and an artificial neuron.

Artificial Neural Network (ANN) Layers:-
An artificial neural network is typically organized in layers. Layers are made up of many interconnected 'nodes', each containing an 'activation function'. A neural network may contain the following three layers:


INPUT LAYER:- The purpose of the input layer is to receive the values of the explanatory attributes for each observation as input. The number of input nodes is usually equal to the number of explanatory variables. The input layer presents patterns to the network, which communicates with one or more hidden layers.
The input layer nodes are passive, which means they do not change the data. Each node receives a single value on its input and duplicates that value on its many outputs, sending a copy to every hidden node.

HIDDEN LAYER:- Hidden layers apply transformations to the input values within the network. Each hidden node has incoming arcs from input nodes or other hidden nodes, and outgoing arcs to output nodes or other hidden nodes. In the hidden layer, the actual processing is done through a system of weighted 'connections'.
There may be one or more hidden layers. Values entering a hidden node are multiplied by weights, a set of predetermined numbers stored in the network. The weighted inputs are then summed to produce a single number.

OUTPUT LAYER:- The hidden layers connect to the output layer, which receives connections from the hidden layers or directly from the input layer and returns an output value corresponding to the prediction of the response variable. In binary classification problems there is usually a single output node, while multi-class problems typically use one output node per class.
The active nodes of the output layer combine and modify the data to produce the output values. The neural network's ability to provide useful data manipulation lies in the proper choice of weights, which distinguishes it from conventional information processing.
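To make the weighted-connection idea concrete, here is a minimal NumPy sketch of a single hidden node computing its weighted sum; the inputs, weights, and bias are made-up illustrative values, not anything from the article:
import numpy as np

#Hypothetical example: one hidden node with three incoming connections
inputs = np.array([0.5, 0.1, 0.9])    #values arriving from the input layer
weights = np.array([0.4, -0.6, 0.2])  #one weight per incoming arc
bias = 0.1

#Multiply each input by its weight, sum the results, and add the bias
weighted_sum = np.dot(inputs, weights) + bias
print(weighted_sum)  #0.42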

ACTIVATION FUNCTION:-

The activation function decides whether a neuron should be activated or not by calculating the weighted sum and adding a bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron.
The input is fed to the input layer, and the neurons perform a linear transformation on this input using the weights and biases:
                           x = (weight * input) + bias
Then, an activation function is applied to the above result.

Finally, the output from the activation function moves to the next hidden layer and the same process is repeated. This forward movement of information is known as forward propagation.
What if the generated output is far from the actual value? Using the output from forward propagation, an error is calculated. Based on this error value, the weights and biases of the neurons are updated. This process is known as back-propagation. A minimal sketch of forward propagation follows.
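Here is a minimal NumPy sketch of forward propagation through one hidden layer and an output layer; the layer sizes, activations, and random weights are illustrative assumptions, not values from the article:
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
x = rng.random(4)                         #one input sample with 4 features

#Layer parameters, randomly initialized for the sketch
W1, b1 = rng.random((4, 3)), np.zeros(3)  #input -> hidden (3 nodes)
W2, b2 = rng.random((3, 1)), np.zeros(1)  #hidden -> output (1 node)

#Forward propagation: weighted sum plus bias, then activation, layer by layer
h = relu(x @ W1 + b1)                     #hidden layer output
y = sigmoid(h @ W2 + b2)                  #network output
print(y)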

Popular types of Activation Function:-

| Activation Function | Equation | Range | Uses | Nature |
|---|---|---|---|---|
| ReLU | A(x) = max(0, x) | 0 to ∞ | ReLU is less computationally expensive than tanh and sigmoid | Non-linear |
| Tanh | tanh(x) = 2*sigmoid(2x) - 1 | -1 to +1 | Usually used in the hidden layers of a neural network, as its values lie between -1 and 1 | Non-linear |
| Softmax | z = np.exp(x); z = z / z.sum() | 0 to 1 (outputs sum to 1) | Usually used when handling multiple classes | Non-linear |
| Sigmoid | A(x) = 1 / (1 + e^(-x)) | 0 to 1 | Output can easily be read as 1 if the value is greater than 0.5 and 0 otherwise | Non-linear |
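For reference, here is a minimal NumPy sketch of the four activation functions from the table:
import numpy as np

def relu(x):
    return np.maximum(0, x)      #range 0 to infinity

def tanh(x):
    return np.tanh(x)            #equivalent to 2*sigmoid(2x) - 1, range -1 to +1

def sigmoid(x):
    return 1 / (1 + np.exp(-x))  #range 0 to 1

def softmax(x):
    z = np.exp(x - x.max())      #subtract the max for numerical stability
    return z / z.sum()           #outputs sum to 1

x = np.array([-2.0, 0.0, 3.0])
print(relu(x), tanh(x), sigmoid(x), softmax(x), sep="\n")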
GRADIENT DESCENT:-
Gradient descent is an optimization algorithm used to minimize a function by iteratively moving in the direction of steepest descent, as defined by the negative of the gradient. In machine learning, we use gradient descent to update our model's parameters, such as the coefficients in linear regression or the weights in a neural network.

Cost Function: A loss function tells us "how good" our model is at making predictions for a given set of parameters. The cost function has its own curve and gradients. The slope of this curve tells us how to update our parameters to make the model more accurate.
There are a number of gradient descent algorithms out there. I'll mention a few below, followed by a minimal sketch:
    Batch Gradient Descent
    Stochastic Gradient Descent
    Mini-batch Gradient Descent
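Here is a minimal NumPy sketch of batch gradient descent fitting a one-parameter linear model; the toy data, learning rate, and iteration count are illustrative assumptions:
import numpy as np

#Toy data generated from y = 3x, so the "true" parameter is 3.0
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

w = 0.0    #initial parameter guess
lr = 0.05  #learning rate

for _ in range(100):
    pred = w * x
    #Cost is the mean squared error: mean((pred - y)^2);
    #its gradient with respect to w is 2 * mean((pred - y) * x)
    grad = 2 * np.mean((pred - y) * x)
    w -= lr * grad  #step in the direction of the negative gradient

print(w)  #converges close to 3.0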

Code in Python:-
After all this theory, let's get practical. First, import the basic packages and the dataset. You can use any freely available dataset; here I use the MNIST handwritten digit (0-9) dataset.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.datasets import mnist
#split the dataset into train and test sets
(x_train, y_train), (x_test, y_test)=mnist.load_data()
Data Pre-processing
x_train.shape
#(60000, 28, 28)
x_test.shape
#(10000, 28, 28)
The MNIST dataset has 70k images in total; after the split, 60k are used for training and 10k for testing. (28, 28) means each image is 28 pixels high and 28 pixels wide.
x_train[59999]
#Outputs the image as a raw pixel array

plt.imshow(x_train[59999])
#Displays the digit as an image

y_train[59999]
#8
y_train.shape
#(60000,)

y_test.shape
#(10000,)
For easier calculation and understanding, reshape each 28x28 image into a flat vector of 784 values:
x_train = x_train.reshape(60000,784)

x_train.shape
#(60000, 784): instead of 28*28, each image is now a 1*784 vector
x_test = x_test.reshape(10000,784)

x_test.shape
#(10000, 784): instead of 28*28, each image is now a 1*784 vector
Scaling the Data (into one single range)
import keras as ks

x_train=ks.utils.normalize(x_train)

x_train[59999]
x_test = ks.utils.normalize(x_test)

x_test[0]
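Note that keras.utils.normalize applies L2 normalization to each row by default. A common alternative for MNIST, shown here as an assumption rather than what this article uses, is to scale the raw pixel values (0-255) into the range [0, 1]:
#Alternative scaling: divide by the maximum pixel value
x_train = x_train / 255.0
x_test = x_test / 255.0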
One-Hot Encoding
y_train=ks.utils.to_categorical(y_train)

y_train
y_test=ks.utils.to_categorical(y_test)

y_test[0]
#array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], dtype=float32)
Now, Building the Neural Network
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
Adding Layers
#Input Layer
model.add(Dense(input_dim=x_train.shape[1], kernel_initializer="random_uniform", activation="relu", units=250))

#Hidden Layer
model.add(Dense(kernel_initializer="random_uniform", activation="relu", units=150))

#Output Layer
model.add(Dense(kernel_initializer="random_uniform", activation="softmax", units=10))
input_dim is the number of neurons in the input layer, i.e. 784; this is why we converted the multi-dimensional image into a single dimension. kernel_initializer is used to assign the initial weights, the activation function is ReLU, and units=250 is the number of neurons in the next layer, which is why I don't use input_dim in the hidden layer.
There are digits 0 to 9, i.e. 10 classes in total, which is why I use SOFTMAX as the activation function in the output layer with units=10.
Compile the Neural Network
model.compile(loss="categorical_crossentropy",optimizer="adam",metrics=["accuracy"])
Train the Neural Network
model.fit(x_train,y_train,epochs=10,batch_size=32)
Epochs, batch_size, and the units in the input and hidden layers are hyper-parameters, which means you can tune these parameters for better accuracy.
Training can take some time, depending on your software and hardware.
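As an aside, Keras can also report test accuracy directly. A minimal sketch using model.evaluate (not part of the original walkthrough):
#Evaluate the trained model on the test set; returns [loss, accuracy]
loss, acc = model.evaluate(x_test, y_test)
print(acc)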
Prediction
pred = model.predict(x_test)

pred = pred>0.5
#threshold the softmax probabilities into True/False

pred
y_test
Compare pred and y_test to see the result of our hard work.
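To read the predictions as digits rather than one-hot rows, here is a small sketch using np.argmax (an addition, not part of the article's original code):
#Convert probability / one-hot rows back into digit labels 0-9
pred_digits = np.argmax(model.predict(x_test), axis=1)
true_digits = np.argmax(y_test, axis=1)
print(pred_digits[:10], true_digits[:10])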
Accuracy
from sklearn.metrics import accuracy_score

accuracy_score(y_test, pred)
#0.9784 (i.e. ~97.84% accuracy)
You can increase the accuracy by tuning the hyper-parameters.
Congratulations, you can now build your own model. If you have any doubts, leave a comment. THANK YOU FOR READING.
