openhub.net
Black Duck Software, Inc.
Black Duck Open Hub
floor
Very Low Activity
Commits: Listings
Analyzed about 18 hours ago, based on code collected about 21 hours ago.
Apr 24, 2023 — Apr 24, 2024
Showing page 1 of 60
* Metal: updated + extended reflection info dumping:
* floor: added toolchain.metal.dump_reflection_info config entry flag to enable this (disabled by default)
* metal_program/metal_pipeline: will now query and dump reflection info / bindings when dump_reflection_info is true
* metal_program/metal_kernel: moved old reflection handling from metal_kernel to metal_program
* metal_program: made the reflection handling compatible with the new MTLBinding system + extended it to handle all function types and their parameters
a2flo
as Florian Ziesche
15 days ago
* metal_kernel/llvm_toolchain: in argument buffers in Metal, arrays of buffers now also use the BUFFER_ARRAY type and have their size set to the #elements in the array (instead of the physical size in bytes and no array info) -> we no longer need to query reflection data when creating an argument buffer (which was deprecated) and can now compute all of the required info ourselves
* floor: more cleanup: removed now unnecessary reload_kernels() + flag, swap() and start_frame(), and removed the "window_swap" parameter from end_frame()
a2flo
as Florian Ziesche
15 days ago
* 14.0 toolchain update: Metal updates:
* drop support for Metal 2.x, Metal 3.0 is now the minimum and default target -> removed version checks and obsolete code in various places
* removed Metal NVIDIA workarounds (no longer needed, since there is no NVIDIA GPU supporting Metal 3.0)
* CGCall/CodeGenModule/MetalFinal: metal kernels now always have 10 parameters (we always have sub-group/SIMD support)
* libfloor metadata: treat array of buffers inside argument buffers the same way as in Vulkan -> sets the BUFFER_ARRAY type and size to #elements now instead of the physical size in bytes (which we can still get by multiplying by 8), this makes things easier on the libfloor/host side
* MetalTypes: added APPLE_PLATFORM enum (via https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/MachO.def#L123)
* MetalLibWriterPass/metallib-dis: make use of new APPLE_PLATFORM enum instead of using magic numbers
* TypePrinter: added ComputeKernelDim/ComputeKernelWorkGroupSize support
a2flo
as Florian Ziesche
15 days ago
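The size change for arrays of buffers inside argument buffers can be sketched as follows. The `arg_info` struct, the `ARG_TYPE` enum and the helper name are hypothetical stand-ins, not libfloor's actual metadata types. The only facts taken from the commit message are that the stored size is now the element count and that the physical size is that count multiplied by 8. Treating the size of a plain buffer as a byte count is an assumption made for the sake of the example.

```cpp
#include <cstdint>

// Hypothetical argument types; only BUFFER_ARRAY is named in the commit message.
enum class ARG_TYPE : std::uint32_t { BUFFER, BUFFER_ARRAY };

// Hypothetical metadata entry for one member of an argument buffer.
struct arg_info {
    ARG_TYPE type;
    std::uint64_t size; // BUFFER_ARRAY: number of elements; BUFFER: bytes (assumption)
};

// Per the commit message, the physical size in bytes of an array of buffers
// is its element count multiplied by 8.
constexpr std::uint64_t physical_size_in_bytes(const arg_info& info) {
    return info.type == ARG_TYPE::BUFFER_ARRAY ? info.size * 8u : info.size;
}

static_assert(physical_size_in_bytes(arg_info { ARG_TYPE::BUFFER_ARRAY, 4 }) == 32);
```

Storing the element count keeps the array information explicit, and the byte size stays one multiplication away.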
* spring cleaning 2024 + bump version to 0.4.0-a1:
* removed all OpenGL code/support
* removed OpenAL/audio support
* removing networking (asio/openssl/crypto)
* removed CUDA < 12.0 / PTX < 8.0 and sm_3x/Kepler support, CUDA 12.0+ / PTX 8.0+ with Maxwell/sm_50 is now required
* drop all Metal 2.x, < iOS 16.0 and < macOS 13.0 support (-> updated/removed various compile-time and run-time version checks)
* metal_compute: requires a Metal 3.0 capable GPU (and Mac2 on macOS) now -> various device support flags are now true by default
* bump macOS requirement to 13.0 and iOS requirement to 16.0
* SDL3 migration:
* built against the 3.1.1 preview right now
* updated all the places that query the underlying native system window/display/etc.
* use SDL_PLATFORM_* instead of SDL_VIDEO_DRIVER_* defines
* use run-time video driver checks/handling in places where this is necessary
* vulkan_compute: initial Wayland support (surface extension + surface creation, but still untested)
* event: mouse events now mostly contain float2 coords + coord scaling is done in floating point now
* event: removed "pressure" on mouse events (we have proper pen events now in SDL -> still need to implement this)
* event: misc minor event changes/updates
* floor: removed our manual Windows DPI scale/awareness handling, this is implemented in SDL now (SDL_WINDOWS_DPI_AWARENESS hint *before* SDL_Init())
* floor: init more SDL sub-systems by default (timer, joystick, haptic, gamepad, sensor)
* floor: use SDL_GetWindowSizeInPixels() to query the actual window size in pixels in get_physical_width/height/screen_size()
* floor: get_scale_factor() now calls SDL_GetWindowDisplayScale() on all platforms except macOS/iOS (where we handle this ourselves)
* CMake: removed use of SDL3main, this is no longer needed
* updated build.sh / Xcode / CMake accordingly
* symbol renaming done via rename_symbols.py
* upgraded to C++23:
* all code compiles in gnu++23/gnu++2b mode now (including device code)
* disable all C++23 compat warnings
* removed cpp_consteval.hpp and cpp_bitcast.hpp, we can now always use C++ standard functionality for that
* bump CMake requirement to CMake 3.20
* note that building against libstdc++ requires GCC 13.0+ now
* also set the C target to gnu17
* minor cleanup
* clang/LLVM/libc++ 16.0+ are required for compilation now
* Xcode 15.0+ is required for compilation now
* removed handling of older toolchains
* use "#pragma once" instead of manual header guards, excluding device common.hpp (need to include from CLI and as pch) and essentials.hpp (must be able to include more than once)
* vulkan_compute: implemented window resizing (renderer reinit) support
* vulkan_compute: removed get_swapchain_image_count() / get_swapchain_image_view()
* llvm_toolchain: we no longer need NVIDIA workarounds on Metal 3.0+
* use filesystem::remove instead of manual 'rm' system calls
* floor: when renderer selection fails (because no toolchain exists), abort right away
* floor: removed acquire_context()/release_context() and related OpenGL-only functions
* removed obsolete SDL pressure patch
* essentials: removed Host-compute "constant" define hack/workaround
* events: removed KERNEL_RELOAD/SHADER_RELOAD and kernel_reload_event/shader_reload_event
* CMake: reorder include directory order (put all as "AFTER")
* build.sh/CMake: silence clang 18 warning about "missing" designated field initializers, since this also triggers on intentionally default-constructed fields (note that for clang 18, we unfortunately need to fully disable all missing field initializer warnings, since only clang 19 added a specific -Wno-missing-designated-field-initializers for this)
* ignore -Wswitch-default warnings (conflicts with the other switch warning)
* build.sh/CMake/Xcode: added -Wno-nan-infinity-disabled due to fast-math UD
* build.sh: preempt libc++ header include path + remove /usr/include include paths, since these can interfere with compiler includes
* build.sh: support OpenCL MSYS2/MinGW system packages
* build.sh: properly detect macOS on arm64
* build.sh: switch to gnu++2b instead of gnu++23 for compat reasons
* build.sh: use dwarf-4 instead of dwarf-2 by default
* build.sh: don't target sse4.1 on macOS/iOS
* CMake: removed duplicated metal_args.hpp + added missing vulkan_args.hpp
* fixed cuda_api compilation (misplaced #endif)
* opencl_image: stencil images are no longer supported (this was only possible on shared OpenGL images), throw when this is specified during creation
* const_math: remove "const" attribute from pure functions (pure is a superset)
* aligned_ptr: added missing <string> include on Windows
* enable/set Metal device features/props by default: 32KiB local memory, 1024 max local size, sub-group (shuffle) support, SIMD reduction, 32-bit float atomics, tessellation with 64 factor, image cube functionality, indirect command support, primitive ID support
* universal_binary: bump everything to v3 and update the Metal target (removed everything that is no longer optional + added platform target (macOS and iOS right now))
* metal_queue: profiling is now always supported
* floor: set vulkan_api_version to 1.3.231 since this is the required version
* updated Xcode project (need to set CONFIGURATION_BUILD_DIR and SYMROOT now)
* soft_f16: disable native fp16 support on x86 macOS
* darwin_helper: enable HDR support on iOS (not tested yet)
* more obsolete Metal/macOS/iOS code removal
* various Xcode updates
a2flo
as Florian Ziesche
17 days ago
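One item in the commit above replaces manual 'rm' system calls with filesystem::remove. A minimal sketch of that pattern, using only the C++ standard library, might look like this. The helper name `remove_path_if_present` is hypothetical and does not come from libfloor.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper: removes a file (or an empty directory) without spawning a shell.
// Returns true if something was removed, false if nothing existed or removal failed.
inline bool remove_path_if_present(const std::filesystem::path& p) {
    std::error_code ec;
    // The error_code overload does not throw; it returns false if the path did not exist.
    const bool removed = std::filesystem::remove(p, ec);
    return removed && !ec;
}
```

Compared to shelling out to 'rm', this avoids quoting problems in paths and works the same way on platforms that have no 'rm' binary.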
* 14.0 toolchain: disable LTO build on MinGW/MSYS2 since I can't get it to work
a2flo
as Florian Ziesche
27 days ago
* version bump to v0.3.0-f1
a2flo
as Florian Ziesche
27 days ago
* updated README with latest example output for each target + updated example binaries
a2flo
as Florian Ziesche
27 days ago
* vulkan_queue: renamed "experimental_no_blocking" -> "no_blocking", this is no longer experimental
* floor_version.hpp: updated VS check for VS2022
a2flo
as Florian Ziesche
27 days ago
* fixed iOS compilation
a2flo
as Florian Ziesche
27 days ago
* 14.0 toolchain: going for release:
* added + package licenses of (hopefully) all libraries/code used in the toolchain
* enabled LTO build by default (can be disabled by -no-lto)
* removed libz3.dll and libgomp-1.dll from Windows toolchain packaging and deployment (no longer needed)
* ignore -Wunused-but-set-variable warnings
a2flo
as Florian Ziesche
27 days ago
* 14.0 toolchain update: Vulkan improvements:
* SPIRVWriter: Vulkan: added support for translating pointer comparisons via OpPtrDiff (<, <=, >=, >)
* OCLToSPIRV: Vulkan: always use acquire-release semantics in atomics when none or sequentially-consistent was specified
a2flo
as Florian Ziesche
28 days ago
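The ordering rule in the OCLToSPIRV item can be illustrated with a small mapping function. This is only a sketch of the stated rule, not the toolchain's code: the `mem_order` enum and the function name are hypothetical, and modelling "none" as its own enum value is an assumption.

```cpp
// Hypothetical ordering enum; "none" stands for "no ordering was specified" (assumption).
enum class mem_order { none, relaxed, acquire, release, acq_rel, seq_cst };

// Rule from the commit message: if no ordering or a sequentially-consistent
// ordering was requested, use acquire-release instead; keep everything else.
constexpr mem_order normalize_for_vulkan(mem_order order) {
    return (order == mem_order::none || order == mem_order::seq_cst)
        ? mem_order::acq_rel
        : order;
}

static_assert(normalize_for_vulkan(mem_order::seq_cst) == mem_order::acq_rel);
static_assert(normalize_for_vulkan(mem_order::relaxed) == mem_order::relaxed);
```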
* CUDA/Metal/Vulkan: proper memory ordering in atomics:
* since hardware has now actually implemented support for this (there are actual functional differences), we need to properly specify this now
* always use acquire-release semantics on all atomic operations for now, since this provides the most guarantees and is supported across all backends
* in the future, I will probably add more fine-grained control over this
* NOTE: Metal doesn't officially support anything but "relaxed" ordering, but the compiler and hardware do support other modes -> use acquire-release semantics with Metal 2.4 onwards
* NOTE: toolchain update incoming
* compute_algorithm: fixed #elements estimation for scan algorithms for non-sub-group implementations
* Metal/device: renamed FLOOR_METAL_MEM_SCOPE_* -> FLOOR_METAL_MEM_FLAGS_*, removed old comment, and added FLOOR_METAL_SYNC_SCOPE_SUB_GROUP and FLOOR_METAL_MEM_FLAGS_OBJECT_DATA to reflect the current Apple naming and functionality
a2flo
as Florian Ziesche
28 days ago
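For readers unfamiliar with acquire-release semantics, the following host-side sketch shows what "always use acquire-release on all atomic operations" looks like with standard C++ atomics. It is an analogy on the host, not the device code generated by the toolchain, and the function names are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Read-modify-write with acquire-release ordering; returns the previous value.
inline std::uint32_t atomic_add(std::atomic<std::uint32_t>& target, std::uint32_t value) {
    return target.fetch_add(value, std::memory_order_acq_rel);
}

// Atomic max built from a compare-exchange loop; returns the previous value.
inline std::uint32_t atomic_max(std::atomic<std::uint32_t>& target, std::uint32_t value) {
    std::uint32_t expected = target.load(std::memory_order_acquire);
    while (expected < value &&
           !target.compare_exchange_weak(expected, value,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire)) {
        // On failure, compare_exchange_weak reloads 'expected'; the loop retries.
    }
    return expected;
}
```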
* print more informative error messages when kernel execution fails
a2flo
as Florian Ziesche
29 days ago
* bump build requirements: now requires a clang/LLVM 13.0+ toolchain (or Xcode / CLI tools 13.3)
* host_atomic: make use of floating point add/sub and integer min/max atomics
* version bump to v0.3.0-b4
a2flo
as Florian Ziesche
29 days ago
* 14.0 toolchain update: CGCall: fixed invalid bitcast by using an address space cast instead
a2flo
as Florian Ziesche
about 1 month ago
* const_math/rt_math/host: make clz(0)/ctz(0) work and return the same everywhere and at compile-time by manually handling 0 and returning the expected values (__builtin_clz/ctz(0) are not considered compile-time constants + return values may differ between x86 and ARM)
a2flo
as Florian Ziesche
about 1 month ago
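A sketch of the zero handling described in the clz/ctz commit above: the helpers below treat 0 explicitly so that the result is identical on every platform and usable at compile time. The names `clz32`/`ctz32` are hypothetical, and returning the bit width (32) for an input of 0 is an assumption, since the commit message only says "the expected values". It does match what `std::countl_zero` and `std::countr_zero` return in C++20.

```cpp
#include <cstdint>

// Count leading zeros of a 32-bit value, with 0 handled explicitly.
constexpr std::uint32_t clz32(std::uint32_t v) {
    if (v == 0) return 32; // assumption: bit width for a zero input
    std::uint32_t n = 0;
    for (std::uint32_t mask = 0x80000000u; (v & mask) == 0; mask >>= 1) ++n;
    return n;
}

// Count trailing zeros of a 32-bit value, with 0 handled explicitly.
constexpr std::uint32_t ctz32(std::uint32_t v) {
    if (v == 0) return 32; // assumption: bit width for a zero input
    std::uint32_t n = 0;
    for (; (v & 1u) == 0; v >>= 1) ++n;
    return n;
}

// Both helpers are usable in constant expressions, including for 0.
static_assert(clz32(0u) == 32 && ctz32(0u) == 32);
static_assert(clz32(1u) == 31 && ctz32(8u) == 3);
```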
* 14.0 toolchain update: various improvements/fixes:
* removed VulkanUtils.h and moved functions into FloorUtils.h -> updated all users
* FloorUtils: split the 32-bit integer simplification from simplify_gep_indices() into separate functions: simplify_integer_to_32bit() that does exactly that and simplify_const_integer_to_32bit() that only does this on constant integers
* VulkanPreFinal: implemented lowering of llvm.memcpy instructions into loops (we need to do this since Vulkan/SPIR-V doesn't have a proper memcpy operation)
* SPIRVWriter: when translating memcpy for Vulkan/SPIR-V, check if the copy length is larger than 1, abort if so (OpCopyMemory can only copy a single value)
* LLVMToSPIRVTransformations: r/vulkan_utils/libfloor_utils/
a2flo
as Florian Ziesche
about 1 month ago
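The memcpy lowering mentioned above happens on LLVM IR inside the toolchain. As a rough source-level analogy only (not the pass itself), a memcpy of `count` elements becomes a loop that copies one value per iteration, which matches the constraint that a single copy operation can only move one value. The function name is hypothetical.

```cpp
#include <cstddef>

// Source-level analogy of lowering a memcpy into a loop:
// each iteration copies exactly one element.
template <typename T>
void copy_elementwise(T* dst, const T* src, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
}
```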
* llvm_toolchain/function_info: clarify that if a local size is set, it is the *required* local size -> renamed + updated all users
* llvm_toolchain/function_info: added get_kernel_dim() helper function to query the kernel dimensionality (if the function is a kernel, returns 1 otherwise / by default)
a2flo
as Florian Ziesche
about 1 month ago
* 14.0 toolchain update: various improvements/fixes:
* added FloorUtils.h: this currently implements helper functions that iterate over all users (or user instructions) of an llvm::Value in a general way, handling both direct users and single-indirection users of constant expressions
* -> use new libfloor_utils::for_all_instruction_users/for_all_users everywhere where we iterate over the users of an instruction or GV (or others)
* AddressSpaceFix: added trivial handling/replacement of llvm.memcpy intrinsics (even if fix_call_instrs is not set / can't be used)
* MetalLibWriterPass: fixed incorrect language version (must be 3.1.0) + updated Metal compiler identity when building for Metal 3.1 + reformat
* SPIRFinal: erase experimental_noalias_scope_decl LLVM intrinsics
* when generating SPIR-V, we now set a "floor.generating_spirv" named metadata for easier detection
* added SPIRFinal module pass: this is only run in SPIR mode (not SPIR-V!) to fix up global variables in the wrong address space (all must be constant or local)
* SPIRVContainerWriterPass/SPIRVWriter: fixed detection of global variables being used inside a function -> uses new helper function that now also handles usage inside constant expressions
* SPIRVInstruction: SPIRVMemoryAccess: added support for scope (make pointer available/visible)
* SPIRVWriter: loads/stores of pointers in storage buffer address space are now marked with make pointer available/visible and non-private pointer flags/masks
a2flo
as Florian Ziesche
about 1 month ago
* llvm_toolchain: actually make OpenCL pch compilation work
* llvm_toolchain: when printing the SPIR-V validator output, specify which target was used (Vulkan or OpenCL)
* const_string: make _cs UDL work on compute backends (need to put the string in constant address space)
a2flo
as Florian Ziesche
about 1 month ago
* 14.0 toolchain update: SPIR-V updates:
* Vulkan: updated/ported dxil-spirv CFG structurizer to the latest version (now at d6cff9039956d6f461625b01981c541eb724088c)
* this now has initial support for loop/selection control masks (note that this isn't set from the outside yet)
* SPIRVWriter: added handling of selection/loop control masks in floor.selection_merge/loop_merge
* updated SPIR-V Tools to latest version (now @libfloor_202403 branch based on f20663ca7fec48fdc88e4c4d7c5889f8b4cc5664)
a2flo
as Florian Ziesche
about 1 month ago
* vulkan_args: in debug mode, when checking for the argument type when setting a buffer arg, we need to ignore implicit args + added asserts in places where "is_implicit" is not expected
* cuda_buffer/cuda_image: fixed potential nullptr access
a2flo
as Florian Ziesche
about 1 month ago
* floor: added is_initialized() helper function to check if libfloor was already initialized
a2flo
as Florian Ziesche
about 1 month ago
* 14.0 toolchain update: added CUDA 12.4 and PTX 8.4 support
a2flo
as Florian Ziesche
about 1 month ago
* added CUDA 12.4 and PTX 8.4 support
a2flo
as Florian Ziesche
about 1 month ago
* compute_queue: made the current execute_with_handler() / execute_cooperative_with_handler() an overload of execute() / execute_cooperative() instead -> queue.execute(kernel, completion_handler, ...)
* compute_queue: added execute_sync() and execute_cooperative_sync() that perform a blocking execution (same as execute_with_parameters() with "wait_until_completion" set to true)
* compute_queue: added is_valid_work_size_type() helper function to simplify work_size_type checking
* cuda_program: .reqntid does not actually enforce the local size when querying the max-threads-per-block of a function -> do this ourselves now + fail the kernel if the reported max total local size is actually smaller than we expected
* cuda_device/metal_device/opencl_device/vulkan_device: set/init minimum expected local memory size (>= 16KiB)
* device_info/llvm_toolchain: added dedicated_local_memory() helper function / FLOOR_COMPUTE_INFO_DEDICATED_LOCAL_MEMORY define that are set to the local memory size that a device supports
* metal_compute: use public maxThreadgroupMemoryLength instead of private maxComputeThreadgroupMemory to query the local memory size
a2flo
as Florian Ziesche
about 1 month ago
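The cuda_program item above describes checking the required local size against the reported maximum and failing the kernel when it does not fit. A minimal sketch of such a check is shown below. The `local_size` struct and the function name are hypothetical, and how libfloor actually obtains the reported maximum is not shown here.

```cpp
#include <cstdint>

// Hypothetical 3D local (work-group) size.
struct local_size {
    std::uint32_t x, y, z;
};

// Returns true if the total required local size fits into the maximum total
// local size reported for the function; a caller would fail the kernel otherwise.
constexpr bool supports_required_local_size(const local_size& required,
                                            std::uint32_t reported_max_total) {
    // Widen to 64 bits so the product cannot overflow.
    const std::uint64_t total = std::uint64_t(required.x) * required.y * required.z;
    return total <= reported_max_total;
}

static_assert(supports_required_local_size(local_size { 32, 32, 1 }, 1024));
```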
* more get_underlying_metal_buffer_safe() / get_underlying_vulkan_buffer_safe() fixes/replacements
a2flo
as Florian Ziesche
about 2 months ago
* vulkan_args: in debug mode, all per-argument checks will now throw and be caught in set_arguments(), which will then print a more informative error (now including function name and argument index) and return false from set_arguments(), which is the intended error path
* vulkan_args: added more argument checks in debug mode (will now test if the argument has the correct type for must variants)
* vulkan_args: added nullptr checks to image/buffer array elements + buffer array elements may actually be nullptr now
* cuda_buffer/cuda_image: fixed unused attributes in release mode
a2flo
as Florian Ziesche
about 2 months ago
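The error path described in the first vulkan_args item (per-argument checks throw, the caller catches, prints the function name and argument index, and returns false) can be sketched in plain C++. Everything below is a toy model: `arg_kind`, `toy_arg`, `check_arg` and this `set_arguments` are hypothetical and only share a name and a control flow with what the commit message describes.

```cpp
#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical argument kinds and a toy argument that records what the
// function expects and what was actually passed.
enum class arg_kind { buffer, image };
struct toy_arg {
    arg_kind expected;
    arg_kind actual;
};

// Per-argument check: throws on a type mismatch.
inline void check_arg(const toy_arg& arg) {
    if (arg.expected != arg.actual) {
        throw std::runtime_error("argument has the wrong type");
    }
}

// Catches any per-argument failure, reports function name + argument index,
// and returns false as the error path.
inline bool set_arguments(const std::string& function_name, const std::vector<toy_arg>& args) {
    std::size_t idx = 0; // declared outside the try block so the handler can report it
    try {
        for (; idx < args.size(); ++idx) {
            check_arg(args[idx]);
        }
    } catch (const std::exception& exc) {
        std::cerr << "invalid argument #" << idx << " in function \""
                  << function_name << "\": " << exc.what() << '\n';
        return false;
    }
    return true;
}
```

Keeping the checks as throwing functions lets each one stay small, while the single catch site adds the context that makes the message useful.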
* 14.0 toolchain update:
* Vulkan: fixed nullptr check when checking for / handling argument buffers
* made all bug report URLs point to the floor_llvm repo
a2flo
as Florian Ziesche
about 2 months ago
* cuda_program: ignore kernels that use too much local/shared memory (we only support static local/shared memory, not dynamic memory, so this is a hard limit for now)
a2flo
as Florian Ziesche
about 2 months ago