The merits/weak points of the cache system

Asia Pacific Institute of Information Technology

Copyright© 2000 All Rights Reserved. Computer System Research Group 7.
The robust multi-level, 512-entry, split TLB (translation lookaside buffer) significantly improves the performance of systems configured with large physical memory or storage, as typically found in server environments, by caching the important address translation information used by operating systems and application software that access that memory. The cache architecture of the AMD Athlon processor thus enables high instruction execution rates by minimizing effective memory latency and system snoop responses, and it exploits the spatial locality of data in transaction-based applications and multiprocessing operating systems. The architecture also supports high-bandwidth data transfers to and from the execution resources, which contributes to significant performance gains for data-intensive software programs.
The AMD Athlon processor's cache architecture is the first to incorporate a system-based MOESI (Modified, Owned, Exclusive, Shared, Invalid) cache control protocol for x86 multiprocessing platforms. The system logic manages memory coherency throughout the system by specifying all cache state transitions, using either a MESI or a MOESI cache coherency protocol, and by filtering out unnecessary processor snoops. As a result, AMD Athlon processors are designed to deliver exceptional performance in both uniprocessor and multiprocessor system configurations. The AMD Athlon processor cache architecture also supports error correction code (ECC) protection, which is a required feature for high reliability in business desktop systems, workstations, and servers. Thus, the AMD Athlon processor's cache architecture provides the features required for high-performance computing from desktop to server configurations.
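The five MOESI states named above can be illustrated with a small state machine. This is a minimal sketch of the textbook MOESI transitions for a single cache line, not AMD's actual system logic; the function names and the `others_have_copy` flag are assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # only copy, dirty
    OWNED = "O"      # dirty, shared with others; this cache supplies the data
    EXCLUSIVE = "E"  # only copy, clean
    SHARED = "S"     # one of several copies
    INVALID = "I"    # no valid copy


def on_local_read(state, others_have_copy):
    """This processor reads the line (hypothetical helper)."""
    if state is State.INVALID:
        # A read miss fetches the line; it is shared if another cache holds it.
        return State.SHARED if others_have_copy else State.EXCLUSIVE
    return state  # M, O, E, S all satisfy a read without changing state


def on_local_write(state):
    """This processor writes the line; other copies are assumed invalidated."""
    return State.MODIFIED


def on_snoop_read(state):
    """Another processor reads the line."""
    if state is State.MODIFIED:
        return State.OWNED      # keep the dirty data and supply it to the reader
    if state is State.EXCLUSIVE:
        return State.SHARED     # no longer the only copy
    return state                # O, S, I are unchanged


def on_snoop_write(state):
    """Another processor writes the line, so the local copy becomes stale."""
    return State.INVALID
```

The Owned state is what distinguishes MOESI from MESI: a modified line can be shared with another cache without first being written back to system memory.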
The newest Pentium III processors include support for 100 and 133 MHz system buses and an Advanced Transfer Cache featuring a 256 KB on-die, full-speed L2 cache plus Advanced System Buffering. The Pentium® III processor includes two separate 16 KB level 1 (L1) caches, one for instructions and one for data.
The AMD Athlon processor with performance-enhancing cache memory includes an integrated, full-speed, 16-way set-associative, 256 KB L2 cache. Previous AMD Athlon processors contain an L2 controller that operates at the maximum frequency compatible with the latest industry-standard SRAMs. With the L2 cache integrated onto the processor, the L2 cache always operates at the same frequency as the processor, thereby minimizing any delays incurred waiting for external data from a slower bus. The newer AMD Athlon processor's L2 cache is 16-way set-associative, twice the associativity of the Intel Pentium III processor's L2 cache (16-way vs. 8-way). Higher associativity improves application performance by reducing conflict misses, so more of an application's local data resides in the high-speed L2 cache memory instead of system memory. Finally, the integrated L2 cache tags improve performance by quickly indicating whether critical application data is located within the L2 cache. Having integrated tags is especially important for processors that use external SRAMs for the L2 cache. If the tags show early enough that application data does not reside in the L2 cache, the processor can immediately request the data from the slower system memory, instead of first checking an external L2 cache and only then requesting the data from system memory.
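The effect of associativity described above can be sketched with a small simulation. The example below is a simplified model, not a description of either processor: it assumes a 64-byte line size and LRU replacement for both configurations (neither is stated in the text), and the class and function names are hypothetical. It compares a 256 KB cache at 16 ways and at 8 ways on an access pattern in which twelve addresses compete for the same set.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, size_bytes, ways, line_bytes=64):
        self.ways = ways
        self.line_bytes = line_bytes
        # Total lines divided by lines per set gives the number of sets.
        self.num_sets = size_bytes // (ways * line_bytes)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        line = address // self.line_bytes
        index = line % self.num_sets   # which set the line maps to
        tag = line // self.num_sets    # identifies the line within the set
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)   # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(entries) >= self.ways:
            entries.popitem(last=False)  # evict the least recently used tag
        entries[tag] = True
        return False


def run_conflict_trace(ways, passes=2):
    """Touch 12 addresses that all map to set 0, repeated `passes` times."""
    cache = SetAssociativeCache(256 * 1024, ways)
    for _ in range(passes):
        for k in range(12):
            cache.access(k * 32 * 1024)  # 32 KB stride lands in set 0 for both
    return cache.hits, cache.misses
```

Under these assumptions, the 16-way cache holds all twelve conflicting lines, so the second pass hits every time, while the 8-way cache evicts each line before it is reused. Real workloads rarely conflict this badly, so the sketch shows the mechanism rather than a typical performance gap.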
The L1 cache provides fast access to recently used data, increasing the overall performance of the system. Certain versions of the Pentium III processor include a discrete, off-die level 2 (L2) cache. This L2 cache is a 512 KB unified, non-blocking cache that improves performance over cache-on-motherboard solutions by reducing the average memory access time and by providing fast access to recently used instructions and data.
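The reduction in average memory access time mentioned above can be made concrete with the standard formula: average access time = hit time + miss rate × miss penalty, applied once per cache level. The sketch below uses hypothetical function names and made-up latencies and miss rates chosen only for illustration; they are not measurements of any Pentium III or Athlon system.

```python
def amat_single_level(l1_hit, l1_miss_rate, memory_latency):
    """Average access time when every L1 miss goes straight to memory."""
    return l1_hit + l1_miss_rate * memory_latency


def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, memory_latency):
    """Average access time when L1 misses are first looked up in an L2 cache."""
    l1_miss_penalty = l2_hit + l2_miss_rate * memory_latency
    return l1_hit + l1_miss_rate * l1_miss_penalty


# Illustrative numbers only (cycles and fractions are assumptions):
# L1 hit = 1 cycle, 5% L1 miss rate, L2 hit = 10 cycles,
# 20% of L2 lookups miss, memory = 100 cycles.
without_l2 = amat_single_level(1, 0.05, 100)          # about 6 cycles
with_l2 = amat_two_level(1, 0.05, 10, 0.20, 100)      # about 2.5 cycles
```

With these assumed figures the L2 cache cuts the average access time by more than half, because most L1 misses are served at L2 latency rather than memory latency.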