removes allocated memory
- CUDA OOMs: Count of out-of-memory errors on the GPU.
- cudaMalloc retries: Number of times a cudaMalloc call failed and was retried after the allocator flushed its cache of free blocks.
- Allocated memory:
- Cur Usage: Currently allocated bytes by tensors.
- Peak Usage: Maximum allocated bytes recorded.
- Tot Alloc / Tot Freed: Total bytes allocated/freed over time.
- from large/small pool: Breakdown for large vs. small block allocations.
- Active memory: Memory currently held by active tensors (similar stats as Allocated memory).
- Requested memory: Bytes actually requested by client code; usually slightly less than Allocated memory because the allocator rounds allocations up to its block sizes.
- GPU reserved memory: Total GPU memory reserved by PyTorch’s caching allocator (includes both used and cached/free blocks).
- Non-releasable memory: Cached memory that, due to fragmentation (e.g. partially used split blocks), cannot be returned to the GPU driver immediately.
- Allocations: Count of all allocation events (total and split into large/small pool allocations).
- Active allocs: Count of currently active allocation events.
- GPU reserved segments: Number of contiguous memory chunks reserved from the GPU (split into large and small segments).
- Non-releasable allocs: Count of cached free blocks (typically leftover halves of split segments) that cannot be released back to the GPU immediately.
- Oversize allocations / Oversize GPU segments: Allocations (and their corresponding memory chunks) that were too large to fit in the regular caching pools and thus were allocated separately.
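The counters above are what `torch.cuda.memory_summary()` prints; the same values can be read programmatically from `torch.cuda.memory_stats()`. A minimal sketch (the helper name `report_cuda_memory` and the chosen subset of keys are ours; it assumes PyTorch is installed and degrades gracefully without a GPU):

```python
import torch

def report_cuda_memory() -> dict:
    """Collect a few of the counters described above from the caching allocator."""
    if not torch.cuda.is_available():
        # No GPU: the allocator has no stats to report.
        return {}
    stats = torch.cuda.memory_stats()
    return {
        "cuda_ooms": stats["num_ooms"],                        # CUDA OOMs
        "cudamalloc_retries": stats["num_alloc_retries"],      # cudaMalloc retries
        "allocated_cur": stats["allocated_bytes.all.current"], # Allocated: Cur Usage
        "allocated_peak": stats["allocated_bytes.all.peak"],   # Allocated: Peak Usage
        "reserved_cur": stats["reserved_bytes.all.current"],   # GPU reserved memory
        "non_releasable_cur": stats["inactive_split_bytes.all.current"],
    }

print(report_cuda_memory())
```

Per-pool breakdowns use the same key scheme with `large_pool`/`small_pool` in place of `all`, e.g. `allocated_bytes.large_pool.current`.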
Memory snapshot
- Active Memory Timeline: a plot of all live allocations over time, so spikes and leaks can be traced back to the tensors (and stack traces) that caused them.
- Allocator State History: a step-by-step log of individual allocator events (allocations, frees, segment reservations) alongside the allocator's segment state at each step.
- Active Cached Segment Timeline: like the Active Memory Timeline, but at the granularity of the cached segments reserved from the GPU.
- Allocator Settings: the allocator configuration the snapshot was captured with.
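The snapshot that feeds these views is captured with PyTorch's memory-history recording APIs (`torch.cuda.memory._record_memory_history` and `_dump_snapshot`; underscore-prefixed but documented). A minimal sketch, with a hypothetical `capture_snapshot` helper and a toy workload standing in for real model code:

```python
import torch

def capture_snapshot(path: str = "snapshot.pickle") -> bool:
    """Record allocator history over a toy workload and dump it to a pickle."""
    if not torch.cuda.is_available():
        return False
    # Start recording allocation/free events (with stack traces).
    torch.cuda.memory._record_memory_history(max_entries=100_000)
    # Toy workload: allocate and drop some tensors so the timeline has events.
    tensors = [torch.randn(256, 256, device="cuda") for _ in range(8)]
    del tensors
    # Dump the recorded history; stop recording afterwards.
    torch.cuda.memory._dump_snapshot(path)
    torch.cuda.memory._record_memory_history(enabled=None)
    return True

print(capture_snapshot())
```

The resulting `.pickle` file can be dropped into the viewer at https://pytorch.org/memory_viz, which renders the four views listed above.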