Commits : Listings

Analyzed about 5 hours ago, based on code collected about 6 hours ago.
Aug 29, 2024 — Aug 29, 2025
Commit Message | Date
workaround CUDA 13 API changes | 1 day ago
add nvidia target for convenience | 1 day ago
reorg nvshmem build stuff | 1 day ago
cursor found and fixed this bug | 4 days ago
GPU MODE slides | 10 days ago
ignore | 16 days ago
fix compiler errors | 16 days ago
user buffer in nccl | 16 days ago
untested sketch of put version w nvshmem | 17 days ago
move the barrier - it's a 0.1% difference | 17 days ago
fix duplication nvshmem get | 18 days ago
simple explanation of transpose.py | 18 days ago
plot transpose idea | 18 days ago
update profiling | 18 days ago
fix profiling output | 19 days ago
nvshmem profiling | 19 days ago
CUDA events | 19 days ago
add cuda events | 19 days ago
implement perftest (no accumulate) in transpose cuda and cublas | 23 days ago
change BW measurement to use 4x bytes not 2x | 23 days ago
disable ranges since dependencies are annoying | 23 days ago
perf test measurement | 24 days ago
remove bulk variant since not applicable | 24 days ago
homogenize source | 25 days ago
transpose nvshmem get on device non-naive variants | 29 days ago
on device get version for naive | 29 days ago
grid stride params good now | 4 months ago
fix block strided issues | 4 months ago
header | 4 months ago
block strided transpose | 4 months ago