News

Posted about 6 years ago
Back in 2014, I posted Moving Oracle Solaris to LP64 bit by bit describing work we were doing then. In 2015, I provided an update covering Oracle Solaris 11.3 progress on LP64 conversion. Now that we've released the Oracle Solaris 11.4 Beta to the public you can see the ratio of ILP32 to LP64 programs in /usr/bin and /usr/sbin in the full Oracle Solaris package repositories has dramatically shifted in 11.4:

    Release        32-bit        64-bit        total
    Solaris 11.0   1707 (92%)     144  (8%)    1851
    Solaris 11.1   1723 (92%)     150  (8%)    1873
    Solaris 11.2   1652 (86%)     271 (14%)    1923
    Solaris 11.3   1603 (80%)     379 (19%)    1982
    Solaris 11.4    169  (9%)    1769 (91%)    1938

That's over 70% more of the commands shipped in the OS which can use ADI to stop buffer overflows on SPARC, take advantage of more registers on x86, have more address space available for ASLR to choose from, are ready for timestamps and dates past 2038, and receive the other benefits of 64-bit software as described in previous blogs.

And while we continue to provide more features for 64-bit programs, such as making ADI support available in the libc malloc, we aren't abandoning 32-bit programs either. A change that just missed our first beta release, but is coming in a later refresh of our public beta, will make it easier for 32-bit programs to use file descriptors > 255 with stdio calls, relaxing a long-held limitation of the 32-bit Solaris ABI.

This work was years in the making, and over 180 engineers contributed to it in the Solaris organization, plus even more who came before to make all the FOSS projects we ship and the libraries we provide be 64-bit ready so we could make this happen. We thank all of them for making it possible to bring this to you now.
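To make the "timestamps and dates past 2038" point above concrete, here is a minimal Python sketch. It assumes the traditional representation of time as a signed 32-bit count of seconds since 1970-01-01 UTC and models two's-complement wraparound; the helper names are invented for this illustration and are not Solaris APIs.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a 32-bit program stores time as a signed 32-bit second counter.
INT32_MAX = 2**31 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_32bit_timestamp():
    """Latest instant a signed 32-bit second counter can represent."""
    return EPOCH + timedelta(seconds=INT32_MAX)


def wrap_int32(value):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


if __name__ == "__main__":
    print(last_32bit_timestamp().isoformat())
    # One second later, a wrapped counter lands far in the past.
    wrapped = wrap_int32(INT32_MAX + 1)
    print((EPOCH + timedelta(seconds=wrapped)).isoformat())
```

A 64-bit counter pushes the same limit out by hundreds of billions of years, which is why LP64 programs are not affected.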
Posted about 6 years ago
It was only a few weeks ago when I posted that the Intel Mesa driver had successfully passed the Khronos OpenGL 4.6 conformance on day one, and now I am very proud that we can announce the same for the Intel Mesa Vulkan 1.1 driver, the new Vulkan API version announced by the Khronos Group last week. Big thanks to Intel for making Linux a first-class citizen for graphics APIs, and especially to Jason Ekstrand, who did most of the Vulkan 1.1 enablement in the driver.

At Igalia we are very proud of being a part of this: on the driver side, we have contributed the implementation of VK_KHR_16bit_storage, numerous bugfixes for issues raised by the Khronos Conformance Test Suite (CTS) and code reviews for some of the new Vulkan 1.1 features developed by Intel. On the CTS side, we have worked with other Khronos members in reviewing and testing additions to the test suite, identifying and providing fixes for issues in the tests as well as developing new tests.

Finally, I’d like to highlight the strong industry adoption of Vulkan: as stated in the Khronos press release, various other hardware vendors have already implemented conformant Vulkan 1.1 drivers, and we are also seeing major 3D engines adopting and supporting Vulkan and AAA games that have already shipped with Vulkan-powered graphics. There is no doubt that this is only the beginning and that we will be seeing a lot more of Vulkan in the coming years, so look forward to it!

Vulkan and the Vulkan logo are registered trademarks of the Khronos Group Inc.
Posted about 6 years ago
This week I wrote a little patch series to get VCHI probing on upstream Raspberry Pi. As we’re building a more normal media stack for the platform, I want to get this upstreamed, and VCHI is at the root of the firmware services for media.

Next step for VCHI upstreaming will be to extract Dave Stevenson’s new VCSM driver and upstream it, which as I understand it lets you do media decode stuff without gpu_mem= settings in the firmware: the firmware will now request memory from Linux, instead of needing a fixed carveout. That driver will also be part of the dma-buf plan for the new v4l2 mem2mem driver he’s been working on.

Dave Stevenson has managed to produce a V4L2 mem2mem driver doing video decode/encode. He says it’s still got some bugs, but things look really promising.

In VC4 display, Stefan Schake submitted patches for fixing display plane alpha blending in the DRM hwcomposer for Android, and I’ve merged them to drm-misc-next.

I also rebased my out-of-tree DPI patch, fixed the regression from last year, and submitted patches upstream and downstream (including a downstream overlay). Hopefully this can help other people attach panels to Raspberry Pi.

On the 3D side, I’ve pushed the YUV-import accelerated blit code. We should now be able to display dma-bufs fast in Kodi, whether you’ve got KMS planes or the fallback GL composition.

Also, now that the kernel side has made it to drm-next, I’ve pushed Boris’s patches for vc4 perfmon into Mesa. Now you can use commands like:

    apitrace replay application.trace --pdraw=GL_AMD_performance_monitor:QPU-total-clk-cycles-vertex-coord-shading

to examine behavior of your GL applications on the HW. Note that doing --pdraw level tracing (instead of --pframes) means that each draw call will flush the scene, which is incredibly expensive in terms of memory bandwidth.
Posted about 6 years ago
A recording of the talk will be available here later.

Downloads
If you're curious about the slides, you can download the PDF or the OTP.

Thanks
This post has been a part of work undertaken by my employer Collabora. I would like to thank the wonderful organizers of Embedded Linux Conference NA for hosting a great event.
Posted about 6 years ago
This is the first entry in an on-going series. Here's a list of all entries:

    1. What has TableGen ever done for us?
    2. Functional Programming
    3. Bits
    4. Resolving variables
    5. DAGs
    (to be continued)

Anybody who has ever done serious backend work in LLVM has probably developed a love-hate relationship with TableGen. At its best it can be an extremely useful tool that saves a lot of manual work. At its worst, it will drive you mad with bizarre crashes, indecipherable error messages, and generally inscrutable failures to understand what you want from it.

TableGen is an internal tool of the LLVM compiler framework. It implements a domain-specific language that is used to describe many different kinds of structures. These descriptions are translated to read-only data tables that are used by LLVM during compilation.

For example, all of LLVM's intrinsics are described in TableGen files. Additionally, each backend describes its target machine's instructions, register file(s), and more in TableGen files.

The unit of description is the record. At its core, a record is a dictionary of key-value pairs. Additionally, records are typed by their superclass(es), and each record can have a name. So for example, the target machine descriptions typically contain one record for each supported instruction. The name of this record is the name of the enum value which is used to refer to the instruction. A specialized backend in the TableGen tool collects all records that subclass the Instruction class and generates instruction information tables that are used by the C++ code in the backend and the shared codegen infrastructure.

The main point of the TableGen DSL is to provide an ostensibly convenient way to generate a large set of records in a structured fashion that exploits regularities in the target machine architecture. To get an idea of the scope, the X86 backend description contains ~47k records generated by ~62k lines of TableGen. The AMDGPU backend description contains ~39k records generated by ~24k lines of TableGen.

To get an idea of what TableGen looks like, consider this simple example:

    def Plain {
      int x = 5;
    }

    class Room<string name> {
      string Name = name;
      string WallColor = "white";
    }

    def lobby : Room<"Lobby">;

    multiclass Floor<int num, string color> {
      let WallColor = color in {
        def _left : Room<num#"_left">;
        def _right : Room<num#"_right">;
      }
    }

    defm first_floor : Floor<1, "yellow">;
    defm second_floor : Floor<2, "gray">;

This example defines 6 records in total. If you have an LLVM build around, just run the above through llvm-tblgen to see them for yourself. The first one has name Plain and contains a single value named x of value 5. The other 5 records have Room as a superclass and contain different values for Name and WallColor.

The first of those is the record of name lobby, whose Name value is "Lobby" (note the difference in capitalization) and whose WallColor is "white".

Then there are four records with the names first_floor_left, first_floor_right, second_floor_left, and second_floor_right. Each of those has Room as a superclass, but not Floor. Floor is a multiclass, and multiclasses are not classes (go figure!). Instead, they are simply collections of record prototypes. In this case, Floor has two record prototypes, _left and _right. They are instantiated by each of the defm directives. Note how even though def and defm look quite similar, they are conceptually different: defm instantiates the prototypes in a multiclass (or several multiclasses), while def creates a record that may or may not have one or more superclasses.

The Name value of first_floor_left is "1_left" and its WallColor is "yellow", overriding the default. This demonstrates the late-binding nature of TableGen, which is quite useful for modeling exceptions to an otherwise regular structure:

    class Foo {
      string salutation = "Hi";
      string message = salutation#", world!";
    }

    def : Foo {
      let salutation = "Hello";
    }

The message of the anonymous record defined by the def-statement is "Hello, world!".

There is much more to TableGen. For example, a particularly surprising but extremely useful feature is the bit sets that are used to describe instruction encodings. But that's for another time.

For now, let me leave you with just one of the many ridiculous inconsistencies in TableGen:

    class Tag<int num> {
      int Number = num;
    }

    class Test<int num> {
      int Number1 = Tag<5>.Number;
      int Number2 = Tag<num>.Number;
      Tag Tag1 = Tag<5>;
      Tag Tag2 = Tag<num>;
    }

    def : Test<5>;

What are the values in the anonymous record? It turns out that Number1 and Number2 are both 5, but Tag1 and Tag2 refer to different records. Tag1 refers to an anonymous record with superclass Tag and Number equal to 5, while Tag2 also refers to an anonymous record, but with the Number equal to an unresolved variable reference.

This clearly doesn't make sense at all and is the kind of thing that sometimes makes you want to just throw it all out of the window and build your own DSL with blackjack and Python hooks. The problem with that kind of approach is that even if the new thing looks nicer initially, it'd probably end up in a similarly messy state after another five years.

So when I ran into several problems like the above recently, I decided to take a deep dive into the internals of TableGen with the hope of just fixing a lot of the mess without reinventing the wheel. Over the next weeks, I plan to write a couple of focused entries on what I've learned and changed, starting with how a simple form of functional programming should be possible in TableGen.
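For readers without an LLVM build at hand, the late binding in the Foo example of this post can be imitated in a few lines of Python. This is only an analogy under stated assumptions: make_record and the dictionary layout are invented for this sketch and say nothing about how llvm-tblgen is implemented internally.

```python
# Toy model of TableGen-style late binding; NOT how llvm-tblgen works.
# `make_record` is a hypothetical helper invented for this sketch.


def make_record(class_fields, overrides=None):
    """Build a record: apply `let` overrides first, then resolve dependent fields."""
    fields = dict(class_fields)
    fields.update(overrides or {})
    # Fields given as callables depend on other fields. They are evaluated
    # only now, so they see overridden values rather than class defaults.
    return {
        name: value(fields) if callable(value) else value
        for name, value in fields.items()
    }


# Mirrors: class Foo { string salutation = "Hi";
#                      string message = salutation#", world!"; }
FOO = {
    "salutation": "Hi",
    "message": lambda record: record["salutation"] + ", world!",
}

if __name__ == "__main__":
    default_record = make_record(FOO)
    # Mirrors: def : Foo { let salutation = "Hello"; }
    hello_record = make_record(FOO, {"salutation": "Hello"})
    print(default_record["message"])
    print(hello_record["message"])
```

The sketch resolves dependent fields in a single pass, so a dependent field that reads another dependent field is not handled; it is meant only to show why the override of salutation is visible in message.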
Posted about 6 years ago
This is the fifth part of a series; see the first part for a table of contents.

With bit sequences, we have already seen one unusual feature of TableGen that is geared towards its specific purpose. DAG nodes are another; they look a bit like S-expressions:

    def op1;
    def op2;
    def i32;

    def Example {
      dag x = (op1 $foo, (op2 i32:$bar, "Hi"));
    }

In the example, there are two DAG nodes, each represented by a DagInit object in the code. The first node has as its operation the record op1. The operation of a DAG node must be a record, but there are no other restrictions. This node has two children or arguments: the first argument is named foo but has no value. The second argument has no name, but it does have another DAG node as its value.

This second DAG node has the operation op2 and two arguments. The first argument is named bar and has value i32, the second has no name and value "Hi".

DAG nodes can have any number of arguments, and they can be nested arbitrarily. The values of arguments can have any type, at least as far as the TableGen frontend is concerned. So DAGs are an extremely free-form way of representing data, and they are really only given meaning by TableGen backends.

There are three main uses of DAGs:

    1. Describing the operands on machine instructions.
    2. Describing patterns for instruction selection.
    3. Describing register files with something called "set theory".

I have not yet had the opportunity to explore the last point in detail, so I will only give an overview of the first two uses here.

Describing the operands of machine instructions is fairly straightforward at its core, but the details can become quite elaborate.

I will illustrate some of this with the example of the V_ADD_F32 instruction from the AMDGPU backend. V_ADD_F32 is a standard 32-bit floating point addition, at least in its 32-bit-encoded variant, which the backend represents as V_ADD_F32_e32.

Let's take a look at some of the fully resolved records produced by the TableGen frontend:

    def V_ADD_F32_e32 {    // Instruction AMDGPUInst ...
      dag OutOperandList = (outs anonymous_503:$vdst);
      dag InOperandList = (ins VSrc_f32:$src0, VGPR_32:$src1);
      string AsmOperands = "$vdst, $src0, $src1";
      ...
    }

    def anonymous_503 {    // DAGOperand RegisterOperand VOPDstOperand
      RegisterClass RegClass = VGPR_32;
      string PrintMethod = "printVOPDst";
      ...
    }

As you'd expect, there is one out operand. It is named vdst and an anonymous record is used to describe more detailed information such as its register class (a 32-bit general purpose vector register) and the name of a special method for printing the operand in textual assembly output. (The string "printVOPDst" will be used by the backend that generates the bulk of the instruction printer code, and refers to the method AMDGPUInstPrinter::printVOPDst that is implemented manually.)

There are two in operands. src1 is a 32-bit general purpose vector register and requires no special handling, but src0 supports more complex operands as described in the record VSrc_f32 elsewhere.

Also note the string AsmOperands, which is used as a template for the automatically generated instruction printer code. The operand names in that string refer to the names of the operands as defined in the DAG nodes.

This was a nice warmup, but didn't really demonstrate the full power and flexibility of DAG nodes. Let's look at V_ADD_F32_e64, the 64-bit encoded version, which has some additional features: the sign bits of the inputs can be reset or inverted, and the result (output) can be clamped and/or scaled by some fixed constants (0.5, 2, and 4). This will seem familiar to anybody who has worked with the old OpenGL assembly program extensions or with DirectX shader assembly.

The fully resolved records produced by the TableGen frontend are quite a bit more involved:

    def V_ADD_F32_e64 {    // Instruction AMDGPUInst ...
      dag OutOperandList = (outs anonymous_503:$vdst);
      dag InOperandList =
        (ins FP32InputMods:$src0_modifiers, VCSrc_f32:$src0,
             FP32InputMods:$src1_modifiers, VCSrc_f32:$src1,
             clampmod:$clamp, omod:$omod);
      string AsmOperands = "$vdst, $src0_modifiers, $src1_modifiers$clamp$omod";
      list<dag> Pattern =
        [(set f32:$vdst, (fadd
          (f32 (VOP3Mods0 f32:$src0, i32:$src0_modifiers,
                          i1:$clamp, i32:$omod)),
          (f32 (VOP3Mods f32:$src1, i32:$src1_modifiers))))];
      ...
    }

    def FP32InputMods {     // DAGOperand Operand InputMods FPInputMods
      ValueType Type = i32;
      string PrintMethod = "printOperandAndFPInputMods";
      AsmOperandClass ParserMatchClass = FP32InputModsMatchClass;
      ...
    }

    def FP32InputModsMatchClass {   // AsmOperandClass FPInputModsMatchClass
      string Name = "RegOrImmWithFP32InputMods";
      string PredicateMethod = "isRegOrImmWithFP32InputMods";
      string ParserMethod = "parseRegOrImmWithFPInputMods";
      ...
    }

The out operand hasn't changed, but there are now many more special in operands that describe whether those additional features of the instruction are used.

You can again see how records such as FP32InputMods refer to manually implemented methods. Also note that the AsmOperands string no longer refers to src0 or src1. Instead, the printOperandAndFPInputMods method on src0_modifiers and src1_modifiers will print the source operand together with its sign modifiers. Similarly, the special ParserMethod parseRegOrImmWithFPInputMods will be used by the assembly parser.

This kind of extensibility by combining generic automatically generated code with manually implemented methods is used throughout the TableGen backends for code generation.

Something else is new here: the Pattern. This pattern, together with all the other patterns defined elsewhere, is compiled into a giant domain-specific bytecode that executes during instruction selection to turn the SelectionDAG into machine instructions. Let's take this particular pattern apart:

    (set f32:$vdst, (fadd ...))

We will match an fadd selection DAG node that outputs a 32-bit floating point value, and this output will be linked to the out operand vdst. (set, fadd and many others are defined in the target-independent include/llvm/Target/TargetSelectionDAG.td.)

    (fadd (f32 (VOP3Mods0 f32:$src0, i32:$src0_modifiers,
                          i1:$clamp, i32:$omod)),
          (f32 (VOP3Mods f32:$src1, i32:$src1_modifiers)))

Both input operands of the fadd node must be 32-bit floating point values, and they will be handled by complex patterns. Here's one of them:

    def VOP3Mods { // ComplexPattern
      string SelectFunc = "SelectVOP3Mods";
      int NumOperands = 2;
      ...
    }

As you'd expect, there's a manually implemented SelectVOP3Mods method. Its signature is

    bool SelectVOP3Mods(SDValue In, SDValue &Src,
                        SDValue &SrcMods) const;

It can reject the match by returning false, otherwise it pattern matches a single input SelectionDAG node into nodes that will be placed into src1 and src1_modifiers in the particular pattern we were studying.

Patterns can be arbitrarily complex, and they can be defined outside of instructions as well. For example, here's a pattern for generating the S_BFM_B32 instruction, which generates a bitfield mask:

    def anonymous_2373anonymous_2371 {    // Pattern Pat ...
      dag PatternToMatch =
        (i32 (shl (i32 (add (i32 (shl 1, i32:$a)), -1)), i32:$b));
      list<dag> ResultInstrs = [(S_BFM_B32 ?:$a, ?:$b)];
      ...
    }

The name of this record doesn't matter. The instruction selection TableGen backend simply looks for all records that have Pattern as a superclass. In this case, we match an expression of the form ((1 << a) - 1) << b on 32-bit integers into a single machine instruction.

So far, we've mostly looked at how DAGs are interpreted by some of the key backends of TableGen. As it turns out, most backends generate their DAGs in a fairly static way, but there are some fancier techniques that can be used as well. This post is already quite long though, so we'll look at those in the next post.
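The S_BFM_B32 pattern from this post can be played with in a small Python model. Everything here is an illustrative assumption: nested tuples stand in for DAG nodes, and match_bfm, select and bfm_b32 are invented names, not LLVM APIs. The real matcher is the generated bytecode described above, and the hardware's treatment of out-of-range shift amounts is not modeled.

```python
# Toy model only: tuples of the form (op, lhs, rhs) stand in for DAG nodes.
# All function names are hypothetical and exist only in this sketch.

MASK32 = 0xFFFFFFFF


def is_node(node, op):
    """True if `node` is a binary node with operation `op`."""
    return isinstance(node, tuple) and len(node) == 3 and node[0] == op


def match_bfm(node):
    """Return (a, b) if node has the shape (shl (add (shl 1, a), -1), b)."""
    if not is_node(node, "shl"):
        return None
    _, inner, b = node
    if not (is_node(inner, "add") and inner[2] == -1):
        return None
    shifted = inner[1]
    if not (is_node(shifted, "shl") and shifted[1] == 1):
        return None
    return shifted[2], b


def select(node):
    """Replace a matching expression tree by a single pseudo-instruction."""
    operands = match_bfm(node)
    return None if operands is None else ("S_BFM_B32",) + operands


def bfm_b32(a, b):
    """Value of ((1 << a) - 1) << b truncated to 32 bits."""
    return (((1 << a) - 1) << b) & MASK32


if __name__ == "__main__":
    expr = ("shl", ("add", ("shl", 1, "a"), -1), "b")
    print(select(expr))
    print(hex(bfm_b32(4, 8)))
```

With a = 4 and b = 8 the expression produces a mask of four set bits starting at bit 8, which is the kind of value the single instruction replaces three separate operations to compute.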
Posted about 6 years ago
Vulkan 1.1 was officially released today, and thanks to a big effort by Bas and a lot of shared work from the Intel anv developers, radv is a launch day conformant implementation.

https://www.khronos.org/conformance/adopters/conformant-products#submission_308 is a link to the conformance results. This is also radv's first time to be officially conformant on Vega GPUs.

https://patchwork.freedesktop.org/series/39535/ is the patch series; it requires a bunch of common anv patches to land first. This stuff should all be landing in Mesa shortly, or will most likely already have landed by the time you read this.

In order to advertise 1.1 you need at least a 4.15 Linux kernel.

Thanks to all involved in making this happen, including the behind-the-scenes effort to allow radv to participate in the launch day!