Branch Prediction

A New Prediction Method

On 30 November, our team invented a new generation of branch predictor that is far more effective than anything that came before.

This algorithm is based on a neural network. It comes closest to how a human learns from previous mistakes to improve future decisions.

The predictor consistently outperforms the current state-of-the-art branch predictor, achieving an accuracy of 99.3% compared to 90%. Notably, we accomplished this without any reinforcement learning.

This is a significant milestone because:

It marks the first time in the branch prediction field that a neural network method has become the state of the art.
The last breakthrough in the field was 18 years ago.

We hope to apply the same approach to very different problems soon. It is easy to imagine this method being applied to environments that increasingly resemble the real world, perhaps the stock market or predicting the next token in a large language model.

Below are more detailed descriptions for those who want to learn more about branch prediction and our NN-based predictor in general:

On Branch Prediction

Single-threaded computers are limited by two big things:

How predictable the path of the branches is.
How predictable the locality of data is.

Branch predictors are what we use to guess which path a branch will take before the processor has actually resolved it. This is a big deal at the lowest levels of software and hardware.

Branch predictors are super important for making computers faster and keeping things efficient. While they don’t directly drive Moore’s Law (the observation that the number of transistors on a chip doubles roughly every two years), they do help us get the most out of the extra transistors we’re packing into chips every year.

On 1st December, "Kwun's Cousins" from Sheffield University invented a neural network branch predictor that achieves unprecedented accuracy.

The predictor was used to compete in Huawei's Branch Prediction competition. Here are the top teams, from 1st place to 4th:

Kwun's Cousins - Sheffield University
Honor 9 eLite - University of Southampton
Rookie Bug - Imperial College London
XSH - Oxford University

The neural network-based branch predictor has an accuracy of 99.3%.

In comparison, the state-of-the-art predictor, TAGE-SC-L, achieves around 84.6% accuracy on the same trace set.
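Accuracy figures this close to 100% are easier to compare as misprediction rates, since it is the mispredictions that cost the processor time. Here is a small sketch of that arithmetic, using only the two accuracy figures quoted above. The function names are just for illustration.

```python
def misprediction_rate(accuracy):
    """Fraction of branches predicted wrongly, given accuracy in [0, 1]."""
    return 1.0 - accuracy


def misprediction_ratio(baseline_accuracy, new_accuracy):
    """How many times more often the baseline mispredicts than the new predictor."""
    return misprediction_rate(baseline_accuracy) / misprediction_rate(new_accuracy)


# Accuracy figures quoted in this post: TAGE-SC-L vs. the neural network predictor.
ratio = misprediction_ratio(0.846, 0.993)
print(f"Baseline mispredicts about {ratio:.0f}x as often")
```

Put this way, 84.6% against 99.3% means a misprediction rate of 15.4% against 0.7%, which is roughly 22 times fewer mispredictions on these traces.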

Additionally, the neural network-based predictor uses far less memory and requires fewer instructions to compute. In other words, this could very likely be the future state-of-the-art branch predictor that will improve the performance of all computers.

Essence

The core concept of the neural network predictor is simple: never repeat the same mistake. In practice, it rarely makes the same mistake twice.

Currently, the industry approach for perceptron-based predictors is the multi-perspective perceptron predictor. Although effective, it uses a lot of memory, and its accuracy is nowhere near that of the state-of-the-art predictor, TAGE-SC-L.

Our predictor is simple: it is a single-perspective predictor that fits within 200 lines of code. TAGE-SC-L, on the other hand, has 2000.

The neural network predictor comprises three parts:

1. The Predictor Itself: A neural network that assigns weights based on the history of branch outcomes.

Every time it makes a prediction, it adds up the weights based on past outcomes as well as its specific memory of the current branch address.

It makes predictions based on whether the sum of weights is above or below a predefined threshold.

2. Cluster Group Tags: This groups different branch addresses into tags based on their characteristics. Pre-processing before the run provides the predictor with extra information for more accurate predictions.

3. Unique Clustering Technique: Unlike anything currently in the field, this is inspired by the 2010 Nobel Prize winners Andre Geim and Konstantin Novoselov, who shocked the world by isolating graphene with something as simple as Scotch tape. In the same spirit, the predictor performs group tagging recursively, with each pass building upon the previous tagging.

This is the first time anything similar has been done in the branch prediction field.
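To make the first part (the predictor itself) more concrete, here is a minimal sketch of a perceptron-style predictor in Python. It is not the competition implementation. The class name, table size, history length, and training margin are all illustrative assumptions, and the update rule is the classic perceptron training rule from the branch prediction literature, swapped in to show the general idea. It leaves out the cluster group tags entirely. What it does show is the sum of history weights plus a per-branch term, compared against a threshold.

```python
class PerceptronPredictor:
    """Illustrative perceptron-style branch predictor (hypothetical, simplified)."""

    def __init__(self, history_length=16, table_size=1024, decision_threshold=0):
        self.table_size = table_size
        self.decision_threshold = decision_threshold
        # Training margin commonly used in the perceptron predictor literature
        # (an assumption here, not a value from this post).
        self.train_margin = int(1.93 * history_length + 14)
        # One weight row per table entry: index 0 is the per-branch bias
        # (the "memory" of this branch address), and the remaining entries
        # weight the global history bits.
        self.weights = [[0] * (history_length + 1) for _ in range(table_size)]
        # Global history as +1 (taken) / -1 (not taken), most recent last.
        self.history = [-1] * history_length

    def _row(self, pc):
        # Simple modulo indexing; real designs hash the address.
        return self.weights[pc % self.table_size]

    def _sum(self, pc):
        row = self._row(pc)
        return row[0] + sum(w * h for w, h in zip(row[1:], self.history))

    def predict(self, pc):
        """Predict taken when the weighted sum reaches the threshold."""
        return self._sum(pc) >= self.decision_threshold

    def update(self, pc, taken):
        """Train on the real outcome, then shift it into the history."""
        total = self._sum(pc)
        predicted = total >= self.decision_threshold
        target = 1 if taken else -1
        # Learn on a mistake, or when the sum was not confident enough.
        if predicted != taken or abs(total) <= self.train_margin:
            row = self._row(pc)
            row[0] += target
            for i, h in enumerate(self.history):
                row[i + 1] += target * h
        self.history = self.history[1:] + [target]


def run_trace(predictor, trace):
    """Return prediction accuracy over a list of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


# Toy usage: a single branch that alternates taken / not taken.
toy_trace = [(0x80, i % 2 == 0) for i in range(2000)]
print(f"accuracy on toy trace: {run_trace(PerceptronPredictor(), toy_trace):.3f}")
```

Training on every mistake is what lets a predictor like this avoid repeating the same error. Each wrong prediction pushes the weights for that branch and history pattern toward the correct outcome.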

This is a scenario where the product of the parts is bigger than the sum of the parts.

As a result of this new method, our branch predictor is the most effective and the most accurate branch predictor.

I have been invited to Huawei’s headquarters in Shenzhen in January 2025 to present the technical details of my implementation.