An ASIC developed by Google, optimized for large-scale matrix multiplication and energy efficiency.
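- Systolic array with pipelining: operands pass directly between neighboring processing elements, which minimizes memory accesses
- Ahead-of-time compilation (XLA) fixes memory access patterns at compile time, so the hardware can use software-managed scratchpads instead of caches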
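
To make the systolic-array point concrete, here is a minimal pure-Python sketch of a cycle-by-cycle simulation. It is an illustration under stated assumptions, not a model of the actual TPU hardware: it uses the simpler output-stationary dataflow (each processing element keeps its own accumulator), whereas the TPU matrix unit is usually described as weight-stationary. The function name `systolic_matmul` and the `memory_reads` counter are hypothetical names introduced for this example.

```python
def systolic_matmul(A, B):
    """Simulate an n x m grid of processing elements (PEs) computing A @ B.

    Assumption: output-stationary dataflow. Rows of A enter from the left
    edge, columns of B enter from the top edge, each skewed by one cycle
    per row/column so matching operands meet at the right PE.
    Returns (result, memory_reads).
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # per-PE accumulator
    a_reg = [[0] * m for _ in range(n)]  # operand travelling rightward
    b_reg = [[0] * m for _ in range(n)]  # operand travelling downward
    memory_reads = 0

    # The last useful operand pair reaches PE (n-1, m-1) at cycle n+m+k-3.
    for t in range(n + m + k - 2):
        # Visit PEs from the far corner so each one still sees its
        # neighbor's value from the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j > 0:
                    a_in = a_reg[i][j - 1]   # passed along from the left PE
                else:
                    s = t - i                # row i is delayed by i cycles
                    a_in = 0
                    if 0 <= s < k:
                        a_in = A[i][s]       # edge PE: the only memory read
                        memory_reads += 1
                if i > 0:
                    b_in = b_reg[i - 1][j]   # passed along from the PE above
                else:
                    s = t - j                # column j is delayed by j cycles
                    b_in = 0
                    if 0 <= s < k:
                        b_in = B[s][j]
                        memory_reads += 1
                a_reg[i][j], b_reg[i][j] = a_in, b_in
                acc[i][j] += a_in * b_in     # multiply-accumulate
    return acc, memory_reads


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    C, reads = systolic_matmul(A, B)
    print(C, reads)
```

In this sketch each element of `A` and `B` is fetched from memory exactly once (`n*k + k*m` reads), while a naive triple loop fetches two operands per multiply (`2*n*m*k` reads). The remaining operand movement happens between neighboring PEs, which is the saving the first bullet refers to.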
TPU Versions