Intel Core i7 - 965/920 - architecture and performance

Published by Jean-Luc Hadey on 03.11.08


TLBs and virtualisation

TLB

Nehalem also features changes in the TLB hierarchy, which come hand in hand with the changes in the cache hierarchy.




Nehalem now has a true two-level TLB hierarchy that can be allocated dynamically between threads. The first-level TLB serves all memory accesses and contains 64 entries for 4 KB pages as well as 32 entries for 2 MB / 4 MB pages, and it keeps its four-way associativity. In addition, Nehalem contains a unified second-level TLB with 512 entries for small pages, which is four-way associative as well.

Taken together, every core has 576 entries for small pages (64 in the first level plus 512 in the second), which adds up to 2304 for the whole chip. With 4 KB pages, this number of TLB entries makes the translation of 9216 KB possible, which is more than enough to cover the 8 MB L3 cache Nehalem comes with.
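The arithmetic behind these figures can be retraced in a few lines of Python. This is only a sketch: the entry counts come from the text above, while the core count of four is an assumption inferred from 2304 / 576, and all variable names are illustrative.

```python
# TLB reach for small (4 KB) pages, using the entry counts quoted above.
PAGE_SIZE_KB = 4
L1_SMALL_PAGE_ENTRIES = 64   # first-level TLB entries for 4 KB pages
L2_SMALL_PAGE_ENTRIES = 512  # second-level unified TLB entries
CORES = 4                    # assumption: inferred from 2304 / 576

entries_per_core = L1_SMALL_PAGE_ENTRIES + L2_SMALL_PAGE_ENTRIES
entries_per_chip = entries_per_core * CORES
reach_kb = entries_per_chip * PAGE_SIZE_KB  # memory covered without a TLB miss
l3_cache_kb = 8 * 1024                      # 8 MB shared L3 cache

print(entries_per_core, entries_per_chip, reach_kb)
print("covers L3 cache:", reach_kb >= l3_cache_kb)
```

The reach of 9216 KB exceeds the 8192 KB of the L3 cache, which is the point the paragraph makes.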


Virtualisation

Nehalem's TLB entries can also be tagged with a VPID (Virtual Processor ID). Every TLB entry caches the translation of a virtual to a physical address, and this translation is specific to a given process and virtual machine. Earlier Intel CPUs needed to flush the TLB whenever the processor switched between a virtualized guest and the host instance. With the VPID tag, entries that belong to different virtual machines can stay in the TLB across such a switch. Intel estimates that the latency of a VM round trip on Nehalem is 40 percent of what it was on Merom (65 nm Core 2).
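To illustrate why tagging helps, here is a toy model in Python that counts TLB misses with and without a VPID-style tag. It is a simplified sketch and not Intel's implementation: the class `ToyTLB`, the unlimited capacity, the page tables and the IDs 0 and 1 are all made up for illustration.

```python
class ToyTLB:
    """Toy TLB: either flushed on every VM switch or tagged with a VPID."""

    def __init__(self, tagged):
        self.tagged = tagged
        self.entries = {}      # (tag, virtual_page) -> physical_page
        self.current_vpid = 0
        self.misses = 0

    def switch_to(self, vpid):
        # Without tags, old entries would be ambiguous, so drop them all.
        if not self.tagged:
            self.entries.clear()
        self.current_vpid = vpid

    def translate(self, virtual_page, page_table):
        tag = self.current_vpid if self.tagged else None
        key = (tag, virtual_page)
        if key not in self.entries:
            self.misses += 1   # miss: fetch the translation from the page table
            self.entries[key] = page_table[virtual_page]
        return self.entries[key]


def count_misses(tagged, round_trips=3):
    host_table = {0: 100, 1: 101}    # hypothetical host mappings
    guest_table = {0: 200, 1: 201}   # hypothetical guest mappings
    tlb = ToyTLB(tagged)
    for _ in range(round_trips):
        tlb.switch_to(0)             # host instance
        for page in host_table:
            tlb.translate(page, host_table)
        tlb.switch_to(1)             # guest instance
        for page in guest_table:
            tlb.translate(page, guest_table)
    return tlb.misses


flushing_misses = count_misses(tagged=False)
tagged_misses = count_misses(tagged=True)
print("flush on switch:", flushing_misses, "misses")
print("VPID-tagged:    ", tagged_misses, "misses")
```

In this model the flushing TLB misses on every access after each switch, while the tagged TLB only misses the first time it sees each page. Real hardware has limited capacity, so the benefit is smaller than the toy numbers suggest.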

A further virtualisation improvement can be found in the extended page tables (EPT). Unlike VPID, which only reduces the latency, they are able to eliminate many VM transitions altogether. In earlier Intel designs the hypervisor had to handle the guests' page faults. Now the hardware walks both the guest page tables and the extended page tables itself, which saves many unnecessary VM exits.
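The two-stage translation that extended page tables perform can be sketched in Python as follows. This is a conceptual model only: real page tables are multi-level structures walked by hardware, and the table contents and function names here are hypothetical.

```python
PAGE_SIZE = 4096

def translate(address, table):
    """Map an address through a single-level page table (page -> frame)."""
    page, offset = divmod(address, PAGE_SIZE)
    # A missing key (KeyError) stands in for a page fault in this sketch.
    return table[page] * PAGE_SIZE + offset

# Hypothetical mappings, for illustration only.
guest_page_table = {0: 5, 1: 7}        # guest-virtual page -> guest-physical page
extended_page_table = {5: 42, 7: 13}   # guest-physical page -> host-physical page

def guest_virtual_to_host_physical(address):
    # Stage 1: the guest's own page table, maintained by the guest OS.
    guest_physical = translate(address, guest_page_table)
    # Stage 2: the extended page table, maintained by the hypervisor.
    return translate(guest_physical, extended_page_table)

host_physical = guest_virtual_to_host_physical(PAGE_SIZE + 16)
print(hex(host_physical))
```

Because both stages are resolved in one walk, the guest can change its own page table without the hypervisor having to step in on every update.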





