
Tilera Cache Control

August 29, 2013

Support for moving blocks of memory in and out of a core's cache.

The Tile Processor supports both coherent and incoherent memory models. Coherent shared memory provides the model familiar to most programmers working in pthreads environments: loads and stores behave as if all cores were accessing one global memory scratchpad. The incoherent memory model allows each core to keep its own copy of a memory location, so a write to that address might never become visible to other cores.

Working with Coherent Memory

Most parallel algorithms are written to work with coherent shared memory. When writing such algorithms, remember that the Tile Processor implements a relaxed memory model. In order to guarantee that a store operation to a coherent memory address is visible to other tiles, the core that issued the store instruction must perform a "memory fence". The coherent memory fence operation, provided by tmc_mem_fence(), blocks the processor from issuing any other instructions until all previous stores are visible to all other cores.

The memory fence operation is particularly important when implementing shared memory synchronization algorithms. Suppose core A wants to write a data structure to coherent shared memory and then set a flag telling core B that the data is ready to be consumed.
A memory fence is required between the data structure store instructions and the flag store instruction; otherwise the relaxed memory model might allow core B to see the "data is ready" flag while stores to the data structure are still in flight.
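
The data-then-flag handoff described above can be sketched in portable C. This is only an illustration under stated assumptions: publish_fence() is a hypothetical stand-in built on the C11 atomic_thread_fence() so that the example compiles off-target, and on the Tile Processor its place would be taken by tmc_mem_fence(). The names shared_msg, data_ready, and run_handoff are likewise invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-in for tmc_mem_fence() so the sketch builds off-target. */
static void publish_fence(void) { atomic_thread_fence(memory_order_seq_cst); }

struct message { int payload[4]; };

static struct message shared_msg;  /* data structure written by core A */
static atomic_int data_ready;      /* "data is ready" flag polled by core B */

/* Core A: store the data, fence, and only then raise the flag. */
static void *producer(void *arg) {
    (void)arg;
    for (int i = 0; i < 4; i++)
        shared_msg.payload[i] = i + 1;
    publish_fence();  /* data stores must be visible before the flag store */
    atomic_store_explicit(&data_ready, 1, memory_order_relaxed);
    return NULL;
}

/* Core B: spin until the flag is set, then consume the data. */
static void *consumer(void *arg) {
    int *sum = arg;
    while (atomic_load_explicit(&data_ready, memory_order_acquire) == 0)
        ;  /* busy-wait on the flag */
    *sum = 0;
    for (int i = 0; i < 4; i++)
        *sum += shared_msg.payload[i];
    return NULL;
}

/* Runs one handoff and returns the sum that the consumer observed. */
int run_handoff(void) {
    pthread_t a, b;
    int sum = -1;
    atomic_store(&data_ready, 0);
    pthread_create(&b, NULL, consumer, &sum);
    pthread_create(&a, NULL, producer, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    assert(atomic_load(&data_ready) == 1);  /* flag is set once both finish */
    return sum;
}
```

Without the fence in producer(), a relaxed memory model would be free to make the flag store visible before the payload stores, which is exactly the hazard described above.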

In general, we recommend that application developers avoid this kind of low-level shared memory algorithm development. The MDE provides the standard pthreads synchronization mechanisms as well as some TMC extensions. These primitives should be adequate for many applications.
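
As a rough illustration of that recommendation, the same kind of handoff can be written with the standard pthreads mutex and condition variable, leaving the memory ordering to the library. This is a sketch only: struct mailbox, mailbox_receive, and run_mailbox are hypothetical names, and none of the TMC extensions are shown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical one-slot mailbox guarded by a mutex and condition variable. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    bool            ready;
    int             value;
};

/* Producer thread: fill the slot and wake the receiver. */
static void *mailbox_producer(void *arg) {
    struct mailbox *mb = arg;
    pthread_mutex_lock(&mb->lock);
    mb->value = 42;
    mb->ready = true;
    pthread_cond_signal(&mb->ready_cv);
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Receiver: block until the slot is filled, then return its value. */
int mailbox_receive(struct mailbox *mb) {
    pthread_mutex_lock(&mb->lock);
    while (!mb->ready)  /* loop guards against spurious wakeups */
        pthread_cond_wait(&mb->ready_cv, &mb->lock);
    assert(mb->ready);
    int v = mb->value;
    pthread_mutex_unlock(&mb->lock);
    return v;
}

/* Runs one producer against the calling thread; returns -1 on setup failure. */
int run_mailbox(void) {
    struct mailbox mb;
    pthread_t tid;
    pthread_mutex_init(&mb.lock, NULL);
    pthread_cond_init(&mb.ready_cv, NULL);
    mb.ready = false;
    mb.value = 0;
    if (pthread_create(&tid, NULL, mailbox_producer, &mb) != 0)
        return -1;
    int v = mailbox_receive(&mb);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&mb.ready_cv);
    pthread_mutex_destroy(&mb.lock);
    return v;
}
```

No explicit fence appears here because the lock and unlock operations already order the stores to value and ready with respect to the receiver.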

Working with Incoherent Memory

The Tile Processor also allows applications to allocate incoherent memory. Incoherent memory allows each core to keep its own locally cached version of a memory address without automatically synchronizing that copy with any other core. Thus, a store by core A to an incoherent address is not guaranteed to be visible to core B unless core A flushes the new value out to DRAM and core B then reloads its copy from DRAM. Working with this memory model is more challenging than using coherent memory.
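
The flush-then-reload requirement can be made concrete with a small software model. The code below is a toy simulation and not the Tilera API: it models a single DRAM word and one private cached copy per core, and every name in it (toy_system, toy_store, toy_flush, toy_invalidate, demo_incoherent) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one DRAM word plus a private cached copy for each of two cores. */
struct toy_cache  { int value; bool valid; bool dirty; };
struct toy_system { int dram; struct toy_cache core[2]; };

/* Load: hit in the local cache if valid, otherwise fill from DRAM. */
static int toy_load(struct toy_system *s, int c) {
    if (!s->core[c].valid) {
        s->core[c].value = s->dram;
        s->core[c].valid = true;
        s->core[c].dirty = false;
    }
    return s->core[c].value;
}

/* Store: update only the local copy; DRAM and other cores are untouched. */
static void toy_store(struct toy_system *s, int c, int v) {
    s->core[c].value = v;
    s->core[c].valid = true;
    s->core[c].dirty = true;
}

/* Flush: write a dirty local copy back to DRAM. */
static void toy_flush(struct toy_system *s, int c) {
    if (s->core[c].valid && s->core[c].dirty) {
        s->dram = s->core[c].value;
        s->core[c].dirty = false;
    }
}

/* Invalidate: drop the local copy so that the next load refills from DRAM. */
static void toy_invalidate(struct toy_system *s, int c) {
    s->core[c].valid = false;
    s->core[c].dirty = false;
}

/* Returns what core B reads after the flush and reload; *stale_out receives
 * what core B read before them. */
int demo_incoherent(int *stale_out) {
    enum { A = 0, B = 1 };
    struct toy_system s = { .dram = 0 };
    (void)toy_load(&s, B);         /* core B caches the old value */
    toy_store(&s, A, 7);           /* core A's store stays in its own cache */
    *stale_out = toy_load(&s, B);  /* core B still sees its stale copy */
    toy_flush(&s, A);              /* core A pushes the new value to DRAM */
    assert(s.dram == 7);
    toy_invalidate(&s, B);         /* core B discards its copy ... */
    return toy_load(&s, B);        /* ... and reloads from DRAM */
}
```

In this model core B keeps reading its stale copy until both steps have happened: core A's flush alone does not change what core B has cached, and core B's invalidate alone would only reload the old DRAM contents.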

Incoherent memory accesses are most frequently used when interacting with I/O devices. On TILE64, the I/O shims can only read memory values from DRAM, so applications must flush I/O data to memory before posting it to egress. Similarly, an application must invalidate any locally cached copies of a memory address before receiving an ingress packet into it. See the NetIO API Reference (UG212) for more information on working with I/O devices and incoherent memory. The TILEPro I/O shims support direct-to-cache memory accesses, so applications developed for TILEPro are not required to deal with incoherent memory at all.
