Neural Networks
Neural networks are the building blocks of Deep Learning. They are computing systems loosely inspired by the human brain, made of many small units called neurons that work together to learn patterns from data. Where our earlier Deep Learning article gave the big picture, this article looks inside a neural network: at the single neuron, and at how thousands of them combine into a learning machine.
💡 In one line: A neural network is a web of connected "neurons" that pass signals to each other, adjusting their connections to learn patterns from data.
The Neuron: The Building Block
Everything in a neural network is built from a single unit: the neuron (the classic single-neuron model is called a perceptron). A neuron does three simple things:
- Takes inputs: one or more numbers (x₁, x₂, x₃, …).
- Weights and sums them: each input is multiplied by a weight, the results are added together, and a bias is added.
- Applies an activation function: the sum is passed through a function that decides the neuron's final output.
In formula form:
output = f(w₁x₁ + w₂x₂ + w₃x₃ + … + b)
where w₁, w₂, w₃, … are the weights, b is the bias, and f is the activation function.
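The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `neuron` and `sigmoid`, the choice of sigmoid as the activation, and the example numbers are all assumptions made for the sketch.

```python
import math

def sigmoid(z):
    """Squash any real number into the range 0 to 1 (assumed activation)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """Compute f(w1*x1 + w2*x2 + ... + b) for a single neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Step 2 of the list above: weight each input, sum, and add the bias.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Step 3: pass the sum through the activation function.
    return activation(weighted_sum)

# Hypothetical values, picked so the weighted sum is exactly 0.
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.5]
b = 0.0
print(neuron(x, w, b))  # sigmoid(0) = 0.5
```

Swapping in a different `activation` changes only the last step; the weighted sum stays the same.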
Weights and Biases
Weights decide how important each input is: a weight with a large magnitude (positive or negative) means that input strongly influences the output. The bias lets the neuron shift its output up or down, giving it flexibility.
Crucially, weights and biases are not set by humans. They start as random values, and training adjusts them until the network produces good outputs. Learning, in a neural network, is the process of finding the right weights and biases.
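As a rough sketch of the "start as random values" step, the snippet below draws a neuron's starting weights and bias from a small uniform range. The helper name `init_neuron`, the range of ±0.5, and the fixed seed are assumptions for illustration; real frameworks use more careful initialisation schemes.

```python
import random

def init_neuron(n_inputs, rng, scale=0.5):
    """Return (weights, bias) drawn uniformly from [-scale, scale].

    The range is an assumed value for this sketch, not a recommendation.
    """
    weights = [rng.uniform(-scale, scale) for _ in range(n_inputs)]
    bias = rng.uniform(-scale, scale)
    return weights, bias

rng = random.Random(0)  # fixed seed so the sketch gives the same numbers each run
weights, bias = init_neuron(3, rng)
print(weights, bias)
```

These starting values carry no knowledge yet; training is what moves them towards useful ones.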
Activation Functions
Without activation functions, a neural network could only learn straight-line (linear) relationships, no matter how many layers it had. Activation functions add non-linearity, letting the network learn complex, curved patterns.
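A small sketch can make the "only straight lines" claim concrete. It stacks two one-input layers with no activation and shows that they behave exactly like a single linear layer, while putting a ReLU between them breaks that equivalence. All weights and function names here are hypothetical values chosen for the illustration.

```python
def linear(x, w, b):
    """A one-input layer with no activation: w*x + b."""
    return w * x + b

def relu(z):
    return max(0.0, z)

# Hypothetical weights and biases for two stacked layers.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5

def two_linear_layers(x):
    return linear(linear(x, w1, b1), w2, b2)

# Algebra: w2*(w1*x + b1) + b2 = (w2*w1)*x + (w2*b1 + b2)
w_eq = w2 * w1        # -6.0
b_eq = w2 * b1 + b2   # -2.5

def one_linear_layer(x):
    return linear(x, w_eq, b_eq)

def two_layers_with_relu(x):
    return linear(relu(linear(x, w1, b1)), w2, b2)

for x in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    # The first two columns always match; the third differs for x < -0.5.
    print(x, two_linear_layers(x), one_linear_layer(x), two_layers_with_relu(x))
```

The same collapse happens with full weight matrices: a product of linear layers is still one linear layer.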
| Function | Output range | Commonly used for |
|---|---|---|
| Sigmoid | 0 to 1 | Probabilities, binary output |
| Tanh | −1 to 1 | Hidden layers (zero-centred) |
| ReLU | 0 to ∞ | The default for hidden layers: fast and effective |
| Softmax | 0 to 1 (sums to 1) | Multi-class output layers |
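The four functions in the table can be written with the standard library alone. This is a plain-Python sketch for illustration; the function names are assumptions, and deep learning libraries provide their own optimised versions.

```python
import math

def sigmoid(z):
    """Output in the range 0 to 1. Written in two branches to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def tanh(z):
    """Output in the range -1 to 1, centred on zero."""
    return math.tanh(z)

def relu(z):
    """Output is 0 for negative inputs and z itself otherwise."""
    return max(0.0, z)

def softmax(scores):
    """Turn a list of scores into values between 0 and 1 that sum to 1."""
    shift = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))            # 0.5
print(tanh(0.0))               # 0.0
print(relu(-3.0), relu(3.0))   # 0.0 3.0
print(softmax([1.0, 1.0]))     # [0.5, 0.5]
```

Note that softmax works on a whole list of scores at once, which is why it is used on output layers rather than on individual neurons.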
➡️ Read full article: Activation Functions (covered in detail in its own topic).
Building a Network from Neurons
A single neuron is limited. The power comes from connecting many neurons in layers:
- Input Layer: receives the raw data.
- Hidden Layers: neurons that transform the data; more hidden layers make the network "deep."
- Output Layer: produces the final prediction.
In a fully connected network, every neuron in one layer connects to every neuron in the next, and each connection carries its own weight.
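To illustrate the layered, fully connected structure, here is a sketch of a tiny network with 2 inputs, 3 hidden neurons, and 1 output. The helper `dense`, the layer sizes, and the hand-picked weights are all assumptions for the example; in practice the weights would be learned, not written by hand.

```python
def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def dense(inputs, weights, biases, activation):
    """One fully connected layer.

    weights[j][i] is the weight on the connection from input i to neuron j,
    so every neuron sees every input.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs

# Hypothetical hand-picked parameters (2 inputs -> 3 hidden -> 1 output).
hidden_w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden_b = [0.0, 0.0, -1.0]
output_w = [[1.0, 1.0, -2.0]]
output_b = [0.0]

def forward(x):
    hidden = dense(x, hidden_w, hidden_b, relu)          # hidden layer
    return dense(hidden, output_w, output_b, identity)   # output layer

print(forward([1.0, 0.0]))  # [1.0]
print(forward([1.0, 1.0]))  # [0.0]

# Each connection carries its own weight: 2*3 + 3*1 = 9 in total.
n_weights = sum(len(row) for row in hidden_w) + sum(len(row) for row in output_w)
print(n_weights)  # 9
```

With these particular weights the network outputs 1 when exactly one input is 1, a pattern that no single neuron of this kind can produce on its own.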
How a Neural Network Learns
Learning happens by repeating the following steps many times:
- Forward Pass: data flows from input to output, and the network makes a prediction.
- Calculate Loss: the prediction is compared to the correct answer using a loss function.
- Backpropagation: the error is sent backwards through the network, showing how much each weight contributed to it.
- Gradient Descent: every weight is nudged slightly in the direction that reduces the error.
- Repeat: one full pass over the data is called an epoch; after many epochs, the weights settle into values that give accurate predictions.
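The steps above can be sketched for the smallest possible case: a single neuron with no activation, trained on four made-up points from the rule y = 2x + 1. The data, the learning rate, the epoch count, and the use of mean squared error as the loss are all assumptions for this sketch. With only one neuron, backpropagation reduces to applying the chain rule once.

```python
# Toy data following the hypothetical rule y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# Fixed starting values instead of random ones, so the run is repeatable.
w, b = 0.0, 0.0
learning_rate = 0.05  # assumed step size

def mean_squared_error(w, b):
    """Loss: average squared gap between prediction and correct answer."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

initial_loss = mean_squared_error(w, b)

for epoch in range(2000):              # one epoch = one full pass over the data
    grad_w, grad_b = 0.0, 0.0
    for x, y in data:
        prediction = w * x + b         # forward pass
        error = prediction - y         # the loss for this example is error**2
        grad_w += 2 * error * x        # backpropagation: d(error**2)/dw
        grad_b += 2 * error            # backpropagation: d(error**2)/db
    grad_w /= len(data)
    grad_b /= len(data)
    w -= learning_rate * grad_w        # gradient descent: step against the gradient
    b -= learning_rate * grad_b

final_loss = mean_squared_error(w, b)
print(round(w, 3), round(b, 3))        # close to 2.0 and 1.0
print(initial_loss, final_loss)        # the loss shrinks towards 0
```

A real network repeats the same loop with many neurons and layers, and a library computes the gradients automatically.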
In short: the forward pass makes a guess, the loss measures the mistake, backpropagation traces it back to each weight, and gradient descent adjusts the weights, over and over.
Types of Neural Networks
| Type | Best for |
|---|---|
| Feedforward (ANN) | Basic classification and prediction |
| CNN (Convolutional) | Images and computer vision |
| RNN / LSTM | Sequences: text, speech, time series |
| Transformer | Language and code (powers LLMs and Gen AI) |
| GAN | Generating realistic images and content |
Pros and Cons of Neural Networks
| ✅ Pros (Advantages) | ⚠️ Cons (Challenges) |
|---|---|
| Learn complex, non-linear patterns | Need large amounts of data |
| Discover features automatically | Computationally expensive to train |
| Flexible across images, text, and audio | Hard to interpret ("black box") |
| State-of-the-art accuracy on many tasks | Risk of overfitting |
| Improve as data grows | Many settings to tune (hyperparameters) |
Applications of Neural Networks
| Domain | Use |
|---|---|
| Vision | Face recognition, object detection, medical imaging |
| Language | Translation, chatbots, large language models |
| Speech | Voice assistants, speech-to-text |
| Finance | Fraud detection, risk prediction |
| Generative AI | Image, text, and audio generation |
Summary
- A neural network is made of neurons organised into input, hidden, and output layers.
- Each neuron computes a weighted sum of its inputs plus a bias, then applies an activation function.
- Weights and biases are learned during training; activation functions add the non-linearity needed for complex patterns.
- Networks learn through forward passes, backpropagation, and gradient descent, repeated over many epochs.
- Specialised types such as CNNs, RNNs, and Transformers power modern vision, language, and Generative AI.