floor_examples
Very Low Activity
Commits: Listings
Analyzed about 12 hours ago, based on code collected about 18 hours ago.
Apr 23, 2023 — Apr 23, 2024
Showing page 1 of 16
Commit Message
Contributor
Files Modified
Lines Added
Lines Removed
Code Location
Date
* updated copyright years
a2flo
as Florian Ziesche
23 days ago
* occ: added CMake support
a2flo
as Florian Ziesche
26 days ago
* hlbvh: removed debug output
a2flo
as Florian Ziesche
27 days ago
* hlbvh: more improvements and fixes:
* collider: moved all buffer zero init to the start so we only need to sync once
* collider: fixed sync in/after "build_aabbs"
* collider: get rid of "colliding_triangles" ping/pong buffers, we run this synchronously with rendering, so one buffer is enough
* collider: fixed use of temporary variables in radix sort parameters
* collider: get rid of some unused kernel parameters
* gl_renderer: this now actually works again (+made the rendering identical to the unified renderer)
* unified_renderer/gl_renderer: instead of just red triangles (where collisions are detected), draw these triangles with a yellow-to-red gradient (since the collision is stored per-vertex, this gives a better impression of where the collision happened)
* unified_renderer: more cleanup on exit
* collider/unified_renderer/gl_renderer: will no longer return and pass the detected collisions, we don't actually need/want this -> we're accessing the "colliding_vertices" buffer anyways and rendering should/must only depend on the "triangle_vis" flag
* Xcode: link against the release libfloor in release mode (not debug)
* general reformatting / minor cleanup
a2flo
as Florian Ziesche
27 days ago
* hlbvh/img: proper resource cleanup on exit
* warp: use libwarp_destroy() instead of libwarp_cleanup() on exit
* occ: added PTX 8.4 support
a2flo
as Florian Ziesche
about 1 month ago
* hlbvh: no ranges :(
a2flo
as Florian Ziesche
about 1 month ago
* hlbvh: improvements, fixes and general modernization:
* replaced the old metal_renderer with a unified_renderer that works with both Metal and Vulkan
* unified_renderer/misc: the renderer can be used together with a different compute backend -> the render and compute contexts may differ (e.g. Metal/Vulkan rendering, CUDA/Host-Compute hlbvh computations)
* obj_loader: can now set additional COMPUTE_MEMORY_FLAGs when loading an .obj file + OpenGL sharing flag now also checks for Metal/Vulkan sharing flags (won't use OpenGL if any is specified) -> when loading from a compute context, loaded model data can now also be used in a render context that is different when sharing flags are set
* set buffer/image debug labels everywhere
* animation: ensure triangle count is < 65536 (need to guarantee this now to be able to use 16-bit indices)
* animation: make use of new sharing sync functionality/flags for the "colliding_vertices" buffer (written on the compute side, read on the render side)
* hlbvh: collide_bvhs() now puts the traversal stack into local memory instead of function scope (register) memory and uses 16-bit instead of 32-bit indices -> will now actually work in Vulkan where we can't use dynamic pointers into function scope arrays (no OpPtrAccessChain on Function storage class) - this may be faster on other backends now as well due to less register pressure
* hlbvh: the required local size in collide_bvhs() is now computed via a constexpr function to a) demo that functionality and b) actually make use of that capability to make a more complex computation for it -> this is based on the available local memory size now, which should be known and constant (we use 64 * 2 == 128 bytes per work-item -> on CUDA devices this will compute a local size of 384 due to 48KiB of available local memory, on an Apple GPU this is likely to be 256 work-items due to 32KiB of available local memory)
* hlbvh: removed the < sm_50 morton code implementation, the bit op variant should be the fastest on all modern devices
* hlbvh: use the specific add reduction/scan algorithms instead of using the non-specific ones with "plus<> {}"
* collider: stop flushing the logger in debug mode (this costs a lot of time)
* collider: everything is now properly synchronized (either sync/blocking execution or explicit queue finish())
* collider: implemented radix sort using an indirect compute pipeline (used if available, otherwise falls back to the direct approach) -> faster
* hlbvh_shaders: removed unnecessary "repl_color" + uniforms data is now placed in an actual buffer
* enabled non-blocking execution on Vulkan + disabled resource tracking on Metal (now that everything is properly sync'ed)
* added --no-unified option to disable the unified renderer
* the current frame time is now set as the window caption
* added CMake support
* updated copyright year
* misc cleanup
a2flo
as Florian Ziesche
about 1 month ago
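The local size arithmetic quoted in the commit above (128 bytes per work-item, 384 work-items for 48KiB, 256 work-items for 32KiB) can be reproduced with a small constexpr function. This is only a minimal sketch: `stack_entries`, `bytes_per_work_item` and `compute_local_size` are hypothetical names, not taken from the hlbvh sources. Reading "64 * 2" as a 64-entry stack of 16-bit indices is an assumption based on the commit text.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: "64 * 2 == 128 bytes" means a 64-entry traversal stack of
// 16-bit indices per work-item. All names here are hypothetical.
constexpr std::uint32_t stack_entries = 64u;
constexpr std::uint32_t bytes_per_work_item =
	stack_entries * static_cast<std::uint32_t>(sizeof(std::uint16_t));

// Largest work-item count whose per-work-item stacks fit into the given
// amount of local memory (plain integer division, no further rounding).
constexpr std::uint32_t compute_local_size(const std::uint32_t local_mem_bytes) {
	return local_mem_bytes / bytes_per_work_item;
}

static_assert(bytes_per_work_item == 128u, "64 entries * 2 bytes");
static_assert(compute_local_size(48u * 1024u) == 384u, "48KiB case from the commit message");
static_assert(compute_local_size(32u * 1024u) == 256u, "32KiB case from the commit message");
```

Because the function is constexpr, the result can be checked at compile time as shown. A real implementation may additionally clamp or round the value to what the device supports, which this sketch does not attempt.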
* minor updates
a2flo
as Florian Ziesche
about 1 month ago
* img: improvements, fixes and general modernization:
* replaced the old single-stage blur implementation with a better approach: this now runs with either a 32x32px, 16x16px or 8x8px local size (caching that size + tap count specific overlap, e.g. 46x46px, 30x30px or 22x22px in the default config) and no longer a tap count specific local size -> this is a) a lot faster, and b) actually runs on backends that have limitations on the total local size (e.g. must be a multiple of 32 or must be a power-of-two)
* added support for running in float32 mode (default) or float16 mode (via --half startup parameter)
* the previous "second cache" is now always active
* profiling/timing now uses the compute_queue profiling functionality if available (more accurate timings!)
* will now dynamically select the best single-stage blur kernel (based on device support)
* make use of compute_queue::execution_parameters_t and compute_queue::execute_with_parameters to properly enforce the "wait until completion" behavior
* flipped the OpenGL parameter: must now start with --with-opengl to enable and use OpenGL, otherwise the software rendering is always used
* removed now unused options/defines
* it is now enforced that the image dim must be a multiple of 32
* added CMake support
* updated README description + added example image
* misc cleanup
* NOTE: with these changes, the single-stage blur actually seems to perform better than the "dumb" blur on most devices
a2flo
as Florian Ziesche
about 2 months ago
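The cached tile sizes quoted in the commit above (46x46px, 30x30px and 22x22px for local sizes of 32x32px, 16x16px and 8x8px) each differ from the local size by 14 pixels per dimension. That is consistent with an odd tap count of 15, i.e. 7 extra pixels on each side. The sketch below only illustrates this arithmetic: `cached_tile_dim` is a hypothetical name, and the tap count of 15 is inferred from the quoted numbers, not stated in the commit.

```cpp
#include <cassert>
#include <cstdint>

// Edge length of the cached tile for a given local size edge and an odd
// tap count: the blur needs (tap_count / 2) extra pixels on each side,
// i.e. (tap_count - 1) extra pixels per dimension in total.
// Hypothetical helper, not part of the img example sources.
constexpr std::uint32_t cached_tile_dim(const std::uint32_t local_dim,
										const std::uint32_t tap_count) {
	return local_dim + (tap_count - 1u);
}

// Assumption: a default tap count of 15, inferred from 46 - 32 == 14.
static_assert(cached_tile_dim(32u, 15u) == 46u, "32x32 local size caches 46x46");
static_assert(cached_tile_dim(16u, 15u) == 30u, "16x16 local size caches 30x30");
static_assert(cached_tile_dim(8u, 15u) == 22u, "8x8 local size caches 22x22");
```

With this scheme the local size stays fixed at a power of two while only the cached region grows with the tap count, which matches the stated motivation of running on backends that restrict the total local size.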
* warp: unified renderer overhaul and modernization:
* rendering is now multi-threaded: we create/run a thread per parallel/pipelined frame that may be active (right now: 2 frames, but this may be 3 in the future)
* the main thread will now mostly only perform event handling and kick off the occasional frame rendering (main thread is throttled by a simple 500µs sleep for now)
* set the new NO_RESOURCE_TRACKING and VULKAN_NO_BLOCKING context flags when creating the renderer compute_context (so that the per-frame objects actually have an effect in Metal and Vulkan)
* unified_renderer: most of the render state is now allocated+stored per parallel frame (frame_object_t) and then of course only accessed by a single frame / render thread -> prevents any unnecessary synchronization or waiting between frames
* unified_renderer: the renderer can now be used together with a different compute backend -> the render and compute contexts may differ now (e.g. Metal/Vulkan rendering, CUDA/Host-Compute warp computations)
* unified_renderer: set proper sharing flags when creating the FBO images using the new SHARING_SYNC and SHARING_COMPUTE/RENDER_READ/WRITE flags (we either need to sync FBO images to the compute backend or need to sync the computed warp output to the render backend)
* unified_renderer: added support for indirect rendering / indirect command pipelines (+added now required fences), which should a) generally be faster and b) don't require any encoding at run-time
* unified_renderer: added support for flushing the renderer (waits until in-flight frames are done and locks down rendering while active)
* unified_renderer: due to the requirements of indirect command pipelines, all uniforms are now stored within a single per-frame uniform buffer, which is updated once per frame and then used by multiple shaders/kernels within that frame
* unified_renderer: also need to encode the shadow image and skybox texture in argument buffers now
* unified_renderer: added post_init() function that performs various initialization after the initial renderer init (-> will now use this to store a pointer to the model and camera, so that we don't need to specify these every time we want to render something)
* unified_renderer: reduced shadow map dim from 16k to 8k (desktop) and 4k to 2k (iOS), we don't really need that much resolution and this takes up a lot of memory
* unified_renderer: use VULKAN_HOST_COHERENT for the frame uniforms buffer (this significantly speeds up rendering, since we can directly write to it, instead of always allocating a tmp buffer for this each frame)
* unified_renderer: the final frame present is now always blocking -> fixes the situation where a new frame (using the same object) might already be queued again, we don't want this, since it would require additional sync and isn't actually beneficial
* unified_renderer: can now flag frame objects to let them rebuild their pipelines at the start of the next frame rendering (note that the renderer will be flushed for this)
* unified_renderer: libwarp_camera_setup variable is now part of the renderer (so we don't accidentally pass a new variable (pointer) to libwarp that would lead to unnecessary recompilation)
* unified_renderer: the decision whether a frame is a fully rendered frame or a warped frame is now done prior to the point where the frame is "enqueued", this way, we can now actually guarantee proper render/warp frame ordering and generally handle all of the different warp flags/state (note that each frame now also contains additional warp state so it knows what to do)
* unified_renderer: use of argument buffers and indirect commands/rendering is now enabled by default
* unified_renderer: added debug names/labels to more things
* warp_shaders: updated to use array_param<> instead of just array<> for arrays of images
* added --always-render option to only perform full frame rendering (instead of rendering + warping)
* added --no-tessellation option to disable tessellation even if the device actually supports it
* added --no-arg-buffer option to disable the use of argument buffers (also disables tessellation and indirect commands/rendering)
* added --no-indrect option to disable the use of indirect commands/rendering
* added key pad 0 - 5 key handling: these either display the correct color frame (0) or any of the debug visualizations (1-5)
* unified_renderer/gl_renderer: added support for "debug blitting", i.e. can call new libwarp_debug_* (OpenGL) or internal (unified renderer) functions now that visualize the different warp buffers for debugging purposes
* gl_renderer: generally simplify blitting
* gl_renderer: libwarp_camera_setup variable is now global
* camera: since the camera update and camera state query can happen from multiple threads now: store all important camera values in a camera_state_t object that can be updated and accessed safely from multiple threads (note that this is a ring buffer, so we generally shouldn't block other threads when updating)
* camera: changed all float/single-precision variables to double/double-precision variables for higher accuracy
* camera: moved camera state update into a separate function (this way an external caller can force an update)
* auto_cam: force camera state update when running the auto cam now
* updated copyright year
* updated code according to latest libfloor changes
* updated README
* NOTE: this is optimized for low latency right now and we can easily get +20% more FPS by encoding/rendering the frames in parallel, but there are still some issues with that (in the future, I'll probably add an option to select between latency and bandwidth optimized rendering)
* NOTE: requires libwarp >= v0.2.0 now
a2flo
as Florian Ziesche
about 2 months ago
* nbody: use new sharing flags
a2flo
as Florian Ziesche
about 2 months ago
* more build.sh updates
a2flo
as Florian Ziesche
2 months ago
* one more build script update
a2flo
as Florian Ziesche
2 months ago
* updated all build scripts
a2flo
as Florian Ziesche
2 months ago
* misc updates/improvements:
* obj_loader: use span<> variant of create_image()
* dnn: added CMake support
* dnn: fixed int type cast warnings
* dnn: enable Vulkan compilation again (this still throws a validation error, but it works on NVIDIA drivers, will fix this later)
* dnn: use span<> variant of create_buffer()
* hlbvh: removed unused variables
* img: use span<> variant of create_image()
* warp: use span<> variant of create_image()
* warp: use commit_and_finish() instead of commit()
* updated Xcode projects
a2flo
as Florian Ziesche
2 months ago
* nbody: removed Windows tile size workaround (can use 512 here as well now)
* updated config: removed exec_model host-compute option
* nbody/path_tracer: updated CMakeSettings.json to build with latest VS setup
a2flo
as Florian Ziesche
3 months ago
* occ: added handling of new x86 and ARM CPU tiers/targets
a2flo
as Florian Ziesche
4 months ago
* path_tracer: switched array parameter to array_param
a2flo
as Florian Ziesche
4 months ago
* config: added new host-compute options
a2flo
as Florian Ziesche
4 months ago
* updated README (getting there ...)
a2flo
as Florian Ziesche
4 months ago
* updated README
a2flo
as Florian Ziesche
4 months ago
* updated README
a2flo
as Florian Ziesche
4 months ago
* updated README
a2flo
as Florian Ziesche
4 months ago
* added new path tracer screenshots
a2flo
as Florian Ziesche
4 months ago
* migrated README to .asciidoc
a2flo
as Florian Ziesche
4 months ago
* path tracer improvements and modernization:
* implemented some simple texture sampling support (this can be enabled by starting the program with the --with-textures parameter)
* added some simple textures in data/textures/
* improved random value computation (use better multiplier, use better seed computation, can use bit_cast<float> now)
* uses execution_parameters_t and execute_with_parameters() now
* can now reset everything by pressing 'R'
* misc other code modernization
* updated build.sh, CMake and Xcode project
* now links against SDL2_image
a2flo
as Florian Ziesche
4 months ago
* nbody/warp: updated shaders (can use in.position now instead of frag_coord workaround in Vulkan)
a2flo
as Florian Ziesche
4 months ago
* occ: added PTX 8.3 support
a2flo
as Florian Ziesche
5 months ago
* warp: also updated the quaternion handling here
a2flo
as Florian Ziesche
6 months ago
* nbody: updated quaternion-based rotation handling + fixed warning
a2flo
as Florian Ziesche
6 months ago