Commits : Listings

Analyzed about 3 hours ago, based on code collected about 14 hours ago.
May 05, 2023 — May 05, 2024
Commit Message Contributor Files Modified Lines Added Lines Removed Code Location Date
* updated metal renderers to query and use the metal view pixel format instead of always using BGRA8Unorm
* nbody: Metal: body textures are now computed and stored as RGBA16F
* config: added screen.wide_gamut setting (default to false)
a2flo
as Florian Ziesche
almost 5 years ago
* update hlbvh as well
a2flo
as Florian Ziesche
almost 5 years ago
* update all the things:
  * nbody: build script: removed old macOS X11 paths, added /Library/Frameworks as framework path
  * nbody: added iPad A12 benchmark result
  * nbody: updated link to nbody in GPU Gems 3
  * nbody: vulkan_renderer: removed retained_buffers (handled properly now)
  * occ: added fubar Metal 2.2 targets, added 2.2 metal-std
  * occ: updated metal_device usage according to latest floor changes
  * warp: gl_renderer: fixed warning about u8 raw strings
  * warp: vulkan_renderer: removed retained_buffers (handled properly now)
  * dnn: fixed iOS build (missing include after recent cleanup)
  * config: added soft_printf property in vulkan (default to false)
  * common obj_loader: use const compute_context& instead of shared_ptr<compute_context>
a2flo
as Florian Ziesche
almost 5 years ago
* nbody: vulkan_renderer: use vulkan_kernel::multi_draw instead of manually doing this (also takes care of IUBs now) * nbody: updated R9 285 benchmark result (on AMD/RADV vulkan drivers)
a2flo
as Florian Ziesche
almost 5 years ago
* occ: updated according to latest floor changes (updated target struct usage) * nbody: vulkan_renderer: don't use dynamic offsets now that we don't use dynamic SSBOs any more
a2flo
as Florian Ziesche
almost 5 years ago
* nbody: build script: also added support for building with AMD's new OpenCL SDK
a2flo
as Florian Ziesche
almost 5 years ago
* nbody: updated build script for msys/mingw
a2flo
as Florian Ziesche
almost 5 years ago
* nbody: vulkan_renderer: properly transition both body textures to "read" after creation * nbody: VS/CMake: set FLOOR_DEBUG define in debug mode
a2flo
as Florian Ziesche
almost 5 years ago
* occ: * added --vulkan-std flag (1.0 or 1.1) + handle Vulkan and SPIR-V version accordingly * fubar: added Vulkan 1.1 / SPIR-V 1.3 target
a2flo
as Florian Ziesche
almost 5 years ago
* nbody: added build support for VS2019/CMake/vcpkg/clang + removed VS projects * nbody: fixed missing header includes, misc other fixes
a2flo
as Florian Ziesche
almost 5 years ago
* dnn: updated iOS code
a2flo
as Florian Ziesche
about 5 years ago
* occ: updated OpenCL/Vulkan code
a2flo
as Florian Ziesche
about 5 years ago
* missed the warp vulkan_renderer updates
a2flo
as Florian Ziesche
about 5 years ago
* updated all projects according to latest libfloor changes * updated config: Metal section now sets "dis" to "metallib-dis" (provided by the toolchain), added "soft_printf" (defaults to false)
a2flo
as Florian Ziesche
about 5 years ago
* updated all projects according to latest libfloor changes + update copyright years while we're at it * misc minor cleanup
a2flo
as Florian Ziesche
about 5 years ago
* nbody: fixed compilation when vulkan is enabled
a2flo
as Florian Ziesche
over 5 years ago
* dnn: don't compile kernels that require a local size of 1024 for devices that don't support it + properly deal with it in nn_executer * nbody: updated A10 benchmark results (4 more gflops on iOS 12 *yay*)
a2flo
as Florian Ziesche
over 5 years ago
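
The local-size guard described in the dnn commit above can be illustrated with a small host-side sketch. It is not the project's code: `kernel_variant`, `select_compilable` and the kernel names are hypothetical, and the sketch only assumes that each kernel variant declares the work-group size it needs and that the device reports a maximum.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a kernel variant and the local size it requires.
struct kernel_variant {
	std::string name;
	std::uint32_t required_local_size;
};

// Keeps only the variants whose required local size fits the device limit,
// so that unsupported variants are never handed to the compiler.
std::vector<kernel_variant> select_compilable(const std::vector<kernel_variant>& all,
                                              std::uint32_t device_max_local_size) {
	assert(device_max_local_size > 0);
	std::vector<kernel_variant> usable;
	for (const auto& variant : all) {
		if (variant.required_local_size <= device_max_local_size) {
			usable.push_back(variant);
		}
	}
	return usable;
}
```

A caller would then dispatch only to the variants returned here and fall back to a smaller-local-size variant where the 1024-wide one is missing.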
* dnn: enable sub-group sum reduction everywhere (as long as we have a fixed known SIMD-width)
a2flo
as Florian Ziesche
over 5 years ago
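
As a rough illustration of a sub-group sum reduction with a known SIMD width, the following host-side sketch emulates the usual halving pattern, in which each lane adds the value held by the lane `offset` positions above it. The function name `sub_group_sum` is hypothetical, and the sketch assumes a power-of-two width; on a real device the inner loop would be a single shuffle-style sub-group operation rather than a loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Emulates a sub-group sum over one value per lane.
// The number of lanes stands in for the fixed SIMD width.
float sub_group_sum(std::vector<float> lanes) {
	const std::size_t simd_width = lanes.size();
	// The halving pattern below needs a non-zero power-of-two width.
	assert(simd_width > 0 && (simd_width & (simd_width - 1)) == 0);
	for (std::size_t offset = simd_width / 2; offset > 0; offset /= 2) {
		// On a GPU all lanes execute this step at once.
		for (std::size_t lane = 0; lane < offset; ++lane) {
			lanes[lane] += lanes[lane + offset];
		}
	}
	return lanes[0]; // lane 0 ends up holding the full sum
}
```

With a width of 32 this takes five steps instead of 31 sequential additions, which is why a fixed, known SIMD width matters for this optimization.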
* be more specific on the DNN screenshot description
a2flo
as Florian Ziesche
over 5 years ago
* removed old DNN screenshots + make new one have a black background
a2flo
as Florian Ziesche
over 5 years ago
* put DNN iOS and cli example screenshots into one file
a2flo
as Florian Ziesche
over 5 years ago
* updated README with dnn example info and screenshots * added VGG16 net download script/readme + tiger test image (licensed under CC BY 2.0 (https://creativecommons.org/licenses/by/2.0/deed.en) by Tony Hisgett (https://www.flickr.com/people/37804979@N00))
a2flo
as Florian Ziesche
over 5 years ago
* added DNN example to demonstrate inference on a VGG16 deep neural network:
  * this implements convolution, max-pooling, fully-connected and softmax layers needed for VGG
  * this provides both a desktop/cli implementation (can provide an RGBA PNG image via --image <filename>) and an iOS implementation that runs VGG16 on a camera image
  * a semi-FP16 version of the net can be run with the --fp16 parameter (fully-connected layers are stored as FP16, rest remains FP32), this is enabled by default on iOS (due to 256 MiB buffer size restriction) and disabled by default everywhere else
  * a benchmark mode will be added in the future, right now, VGG is run once on the input image in the desktop/cli version or can run multiple times in the iOS version
  * NOTE: VGG16 model has been adapted from tensorflow-vgg (https://github.com/machrisaa/tensorflow-vgg), which has been adapted from tensorflow-vgg16 (https://github.com/ry/tensorflow-vgg16), which in turn has been adapted from the original VGG16 released under CC BY 4.0 (https://gist.github.com/ksimonyan/211839e770f7b538e2d8#file-readme-md and http://www.robots.ox.ac.uk/~vgg/research/very_deep/)
  * NOTE: tested/runs on CUDA, Metal (macOS with NVIDIA/AMD GPU, Intel GPU might not work; iOS A10+), OpenCL (Intel CPU)
  * NOTE: host-compute and Vulkan are explicitly not supported right now (host-compute runs out of stack, Vulkan can't deal with the cfg madness)
a2flo
as Florian Ziesche
over 5 years ago
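
Two of the layer types named in the DNN commit above, max-pooling and softmax, are simple enough to sketch on the CPU. This is a minimal reference sketch, not the example's GPU kernels: the function names are hypothetical, the pooling assumes the 2x2 window with stride 2 used by standard VGG16, and the data is a single-channel, row-major float image.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax: subtracting the largest logit before
// exponentiating avoids overflow without changing the result.
std::vector<float> softmax(const std::vector<float>& logits) {
	assert(!logits.empty());
	const float max_logit = *std::max_element(logits.begin(), logits.end());
	std::vector<float> probs(logits.size());
	float sum = 0.0f;
	for (std::size_t i = 0; i < logits.size(); ++i) {
		probs[i] = std::exp(logits[i] - max_logit);
		sum += probs[i];
	}
	for (auto& p : probs) {
		p /= sum;
	}
	return probs;
}

// 2x2 max-pooling with stride 2 over a single-channel, row-major image.
std::vector<float> max_pool_2x2(const std::vector<float>& in,
                                std::size_t width, std::size_t height) {
	assert(width % 2 == 0 && height % 2 == 0 && in.size() == width * height);
	const std::size_t out_w = width / 2, out_h = height / 2;
	std::vector<float> out(out_w * out_h);
	for (std::size_t y = 0; y < out_h; ++y) {
		for (std::size_t x = 0; x < out_w; ++x) {
			const std::size_t ix = x * 2, iy = y * 2;
			out[y * out_w + x] = std::max({
				in[iy * width + ix], in[iy * width + ix + 1],
				in[(iy + 1) * width + ix], in[(iy + 1) * width + ix + 1]
			});
		}
	}
	return out;
}
```

In a VGG-style network the pooling halves each spatial dimension between convolution blocks, and the softmax turns the final fully-connected layer's outputs into class probabilities that sum to 1.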
* nbody: guarded initialization to properly unlock the context mutex if init fails/terminates * obj_loader: updated copyright year * updated xcode projects
a2flo
as Florian Ziesche
over 5 years ago
* nbody: set nbody_compute kernel local/work-group size to NBODY_TILE_SIZE (potentially enabling better optimization) + updated RX 590 benchmark result
a2flo
as Florian Ziesche
over 5 years ago
* nbody: added RX 590 benchmark info + updated A10 stats + removed iOS default body count reduction (modern h/w can easily deal with 32k)
a2flo
as Florian Ziesche
over 5 years ago
* occ: added --cuda-no-short-ptr flag to disable short pointers in CUDA/PTX with the new 8.0 toolchain * nbody: added benchmark info for AMD RX 580 (achieved on macOS 10.14.2 with Metal)
a2flo
as Florian Ziesche
over 5 years ago
* nbody: updated i9-7980XE benchmark result tested with latest OpenCL CPU runtimes + better formatting
a2flo
as Florian Ziesche
over 5 years ago
* nbody: added benchmark info of RTX 2080 Ti
* reduction: use max_coop_total_local_size for the coop kernel launch instead of the static 2048 (fixes Turing/sm_75 execution) + use 1 or 2 GiB buffers again (except when running on iOS)
* misc updates
a2flo
as Florian Ziesche
over 5 years ago
* occ: --metal-std option now allows specifying 2.1
* nbody: always use universal binary on iOS; fixup buffer flags; don't use full unroll on AMD; use 16-wide loop unroll with host-compute
* path_tracer: removed pre-C++17 code; use universal binary on iOS
* reduction: use 256 MiB buffer by default (instead of 1 GiB), added proper buffer/mapping flags
* updated misc Xcode projects (latest settings, etc.)
* config: added metal -> force_version field for the toolchain to generate Metal code for a specific version
a2flo
as Florian Ziesche
over 5 years ago