6 GB device memory
Memory bandwidth: 150 GB/s

SM = streaming multiprocessor


32 CUDA cores per SM in Fermi.
Warp = 32 threads, executed together like a vector of that size.
2 warp schedulers per SM.
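
A minimal sketch, in plain C, of the arithmetic implied by the warp size: which warp and lane a thread falls into, and how many warps a block occupies. The function names are hypothetical and not part of any CUDA API; the constant 32 is the warp size from the notes.

```c
#include <assert.h>

#define WARP_SIZE 32  /* threads per warp, as stated in the notes */

/* Index of the warp that a thread (numbered within its block) belongs to. */
int warp_of_thread(int thread)
{
    assert(thread >= 0);
    return thread / WARP_SIZE;
}

/* Position of the thread inside its warp (its "lane"), 0..31. */
int lane_of_thread(int thread)
{
    assert(thread >= 0);
    return thread % WARP_SIZE;
}

/* Warps occupied by a block; a partly filled last warp still counts. */
int warps_in_block(int block_threads)
{
    assert(block_threads > 0);
    return (block_threads + WARP_SIZE - 1) / WARP_SIZE;
}
```

For example, thread 70 sits in warp 2, lane 6, and a block of 100 threads occupies 4 warps, the last one only partly filled.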

GPU allows the on-chip memory of each SM to be split between shared memory and L1 cache ...

Kepler will allow dynamic parallelism (kernels launching other kernels from the GPU).


pgcc -acc -ta=nvidia -Minfo=accel
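
A minimal sketch of the kind of C source this command is meant for, assuming a simple SAXPY loop; the function name and the choice of loop are illustrative and not from the notes. -acc enables OpenACC directives, -ta=nvidia targets NVIDIA GPUs, and -Minfo=accel prints compiler feedback on the accelerated regions. A compiler without OpenACC support ignores the pragma and runs the loop on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for i in [0, n).
   The directive asks the compiler to offload the loop to the accelerator:
   x is copied to the device, y is copied there and back. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    assert(n >= 0 && x != NULL && y != NULL);
    #pragma acc kernels loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Saved under a hypothetical name such as saxpy.c together with a main function, the file would be given to the command above as its input.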


Kernel operates on a hierarchy:
Grid > Blocks > Threads (a grid is made of blocks, a block is made of threads)

Each thread executes on a CUDA core.
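
A rough CPU-side sketch of the grid/block/thread hierarchy, written in plain C with nested loops standing in for the parallel hardware. All names are hypothetical; global_index mirrors the usual CUDA expression blockIdx.x * blockDim.x + threadIdx.x, and on a real GPU the loop iterations would run in parallel rather than in order.

```c
#include <assert.h>

/* Number of blocks needed so that blocks * block_dim >= n (rounds up). */
int blocks_needed(int n, int block_dim)
{
    assert(n >= 0 && block_dim > 0);
    return (n + block_dim - 1) / block_dim;
}

/* Global element index of a thread within a 1-D grid. */
int global_index(int block, int block_dim, int thread)
{
    return block * block_dim + thread;
}

/* Emulate a 1-D grid: each (block, thread) pair handles one element. */
void scale_on_grid(float *data, int n, float factor, int block_dim)
{
    int grid_dim = blocks_needed(n, block_dim);
    for (int block = 0; block < grid_dim; ++block) {
        for (int thread = 0; thread < block_dim; ++thread) {
            int i = global_index(block, block_dim, thread);
            if (i < n)  /* threads past the end of the data do nothing */
                data[i] *= factor;
        }
    }
}
```

With 1000 elements and 256 threads per block, blocks_needed gives 4 blocks, and thread 231 of block 3 handles element 999; the remaining threads of that block stay idle.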