But how well does this work? Can it be 100% efficient? Well, no. Sometimes your software will do something the processor wasn't built to predict, and the data will not be in the L1 cache. The processor asks the L1 cache for the data, the L1 sees that it doesn't have it, and time is lost. The request then goes to the L2, then to the L3, and so on. If you are lucky, the data will be in the L2, but it might not even be in the L3, in which case your program has to wait for the RAM after already waiting on all three caches.
This is called a cache miss. How often does it happen? Depending on how well optimized your code and the CPU are, it might happen on roughly 2% to 5% of memory accesses. ...
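To get a feel for what a miss costs, here is a minimal sketch in Python. The latencies in `LEVELS` are assumed round numbers chosen for illustration, not measurements of any real CPU, and the helper names `access_cost` and `average_cost` are made up for this example. The model follows the description above: each level that is checked adds its own delay before the request moves on to the next one.

```python
# Illustrative latencies in CPU cycles. These are assumed round numbers,
# not measurements; real values vary from one CPU to another.
LEVELS = [("L1", 4), ("L2", 12), ("L3", 40), ("RAM", 200)]


def access_cost(found_in):
    """Cycles spent when the data is finally found at level `found_in`.

    Every level checked on the way down adds its own latency, so finding
    the data in the L3 also pays for the failed L1 and L2 lookups.
    """
    total = 0
    for name, latency in LEVELS:
        total += latency
        if name == found_in:
            return total
    raise ValueError(f"unknown level: {found_in}")


def average_cost(l1_miss_rate, miss_found_in="L2"):
    """Average cycles per access, assuming every L1 miss is served by
    the single level `miss_found_in` (a deliberate simplification)."""
    hit = access_cost("L1")
    miss = access_cost(miss_found_in)
    return (1 - l1_miss_rate) * hit + l1_miss_rate * miss


# Compare the two miss rates mentioned in the text.
for rate in (0.02, 0.05):
    print(
        f"{rate:.0%} misses: "
        f"{average_cost(rate, 'L2'):.2f} cycles if the L2 has the data, "
        f"{average_cost(rate, 'RAM'):.2f} cycles if it has to come from RAM"
    )
```

Under these assumed numbers, a 5% miss rate where every miss goes all the way to RAM raises the average access from 4 cycles to about 16.6, roughly four times slower, even though 95% of accesses still hit the L1. That is why a miss rate of a few percent can still matter a lot.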