
News

Posted 5 days ago
The All Systems Go! 2018 Call for Participation is Now Open! The Call for Participation (CFP) for All Systems Go! 2018 is now open. We'd like to invite you to submit your proposals for consideration to the CFP submission site. The CFP will close on July 30th. Notification of acceptance and non-acceptance will go out within 7 days of the closing of the CFP.

All topics relevant to foundational open-source Linux technologies are welcome. In particular, however, we are looking for proposals including, but not limited to, the following topics:
- Low-level container executors and infrastructure
- IoT and embedded OS infrastructure
- BPF and eBPF filtering
- OS, container, IoT image delivery and updating
- Building Linux devices and applications
- Low-level desktop technologies
- Networking
- System and service management
- Tracing and performance measuring
- IPC and RPC systems
- Security and Sandboxing

While our focus is definitely more on the user-space side of things, talks about kernel projects are welcome, as long as they have a clear and direct relevance for user-space. For more information please visit our conference website!
Posted 8 days ago
We have been working very hard to make it easy for you to migrate your applications to newer, faster SPARC hardware and Oracle Solaris 11. This post provides an overview of the process and the tools that automate the migration. Migration helps you modernize IT assets, lower infrastructure costs through consolidation, and improve performance. Oracle SPARC T8 servers, SPARC M8 servers, and Oracle SuperCluster M8 Engineered Systems serve as perfect consolidation platforms for migrating legacy workloads running on old systems. Applications migrated to faster hardware and Oracle Solaris 11 will automatically deliver better performance without requiring any architecture or code changes.

You can migrate your operating environment and applications using both physical-to-virtual (P2V) and virtual-to-virtual (V2V) tools. The target environment can be configured either with Oracle VM for SPARC (LDoms) or with Oracle Solaris Zones on the new hardware. You can also migrate to the Dedicated Compute Classic - SPARC Model 300 in Oracle Compute Cloud and benefit from Cloud capabilities.

Migration Options
In general there are two options for migration.

1) Lift and Shift of Applications to Oracle Solaris 11
The application on the source system is re-hosted on new SPARC hardware running Oracle Solaris 11. If your application is running on Oracle Solaris 10 on the source system, lift and shift of the application is preferred where possible, because a full Oracle Solaris 11 stack will perform better and is easier to manage. With the Oracle Solaris Binary Application Guarantee, you will get the full benefits of OS modernization while still preserving your application investment.

2) Lift and Shift of the Whole System
The operating environment and application running on the system are lifted as-is and re-hosted in an LDom or Oracle Solaris Zone on target hardware running Oracle Solaris 11 in the control domain or global zone. If you are running Oracle Solaris 10 on the source system and your application has dependencies on Solaris 10 services, you can migrate to either an Oracle Solaris 10 Branded Zone or an Oracle Solaris 10 guest domain on the target. Oracle Solaris 10 Branded Zones help you maintain an Oracle Solaris 10 environment for the application while taking advantage of Oracle Solaris 11 technologies in the global zone on the new SPARC hardware.

Migration Phases
There are three key phases in migration planning and execution.

1) Discovery
This includes discovery and assessment of existing physical and virtual machines, their current utilization levels, and dependencies between systems hosting multi-tier applications or running highly available (HA) Oracle Solaris Cluster type configurations. This phase helps you identify the candidate systems for migration and the dependency order for performing the migrations.

2) Size the Target Environment
This requires capacity planning of the target environment to accommodate the incoming virtual machines. It takes into account the resource utilization levels on the source machine, the performance characteristics of the modern target hardware running Oracle Solaris 11, and the cost savings that result from higher performance.

3) Execute the Migration
Migration can be accomplished using P2V and V2V tools for LDoms and Oracle Solaris Zones. We are continually enhancing migration tools and publishing supporting documentation.
As a first step in this exercise, we are releasing LDom V2V tools that help users migrate Oracle Solaris 10 or Oracle Solaris 11 guest domains that are running on old SPARC systems to modern hardware running Oracle Solaris 11 in the control domain. One of the migration scenarios is illustrated here.

Three commands are used to perform the LDom V2V migration:
1) ovmtcreate runs on the source machine to create an Open Virtualization Appliance (OVA) file, called an OVM Template.
2) ovmtdeploy runs on the target machine to deploy the guest domain.
3) ovmtconfig runs on the target machine to configure the guest domain.

In the documented example use case, validation is performed using an Oracle Database workload. Database service health is monitored using Oracle Enterprise Manager (EM) Database Express.

Migration Resources
We have a Lift and Shift Guide that documents the end-to-end migration use case and a White Paper that provides an overview of the process. Both documents are available at: Lift and Shift Documentation Library.

Stay tuned for more updates on the tools and documentation for LDom and Oracle Solaris Zone migrations, for both on-premises deployments and migrations to the SPARC Model 300 in Oracle Compute Cloud. Oracle Advanced Customer Services (ACS) offers SPARC Solaris Migration services, and they can assist you with migration planning and execution using the tools developed by Solaris Engineering.
Posted 9 days ago
The libre Midgard driver, Panfrost, has reached a number of new milestones, culminating in the above screenshot, demonstrating:

- Textures! The bug preventing textures was finally cracked. I implemented support for textures in the compiler and through Gallium; I integrated some texture swizzling from limare; et voila, textures.
- Multiple shaders! Previously, the Midgard, NIR-based compiler and the Gallium driver were separate; the compiler read GLSL from the disk, writing back compiled binaries to disk, and the driver would read these binaries. While this path is still used in the standalone test driver, the Gallium driver is now integrated with the compiler directly, enabling "online compilation". Among numerous other benefits, multiple shaders can now be used in the same program.
- The stencil test in the Gallium driver. The scissor test could have been used instead, but the stencil test can generalise to non-rectangular surfaces. Additionally, the mechanics of the stencil test on Midgard hardware are better understood than the scissor test for the time being.
- Blending (partial support), again through Mesa. Currently, only fixed-function blending is supported; implementing "blend shaders" is low-priority due to their rarity, complexity, and performance.
- My love for My Little Pony.

Screenshot is CC BY-SA, a derivative of PonyToast's photograph of the Element of Generosity's voice actress.

Warning: the following is ruthlessly technical and contains My Little Pony references. Proceed at your own risk.

Textures are by far the most significant addition. Although work decoding their command stream and shader instructions had commenced months ago, we hadn't managed to replay a texture until May 3, let alone implement support in the driver. The lack of functional textures was the only remaining showstopper. We had poured in long hours debugging it, narrowing down the problem to the command stream, but nothing budged. No permutation of the texture descriptor or the sampler descriptor changed the situation. Yet, everyone was sure that once we figured it out, it would have been something silly in hindsight.

It was.

OpenGL's textures in the command stream are controlled by the texture and sampler descriptors, corresponding to Vulkan's textures and samplers respectively. They were the obvious place to look for bugs. They were not the culprit. Where did the blame lie, then? The shader descriptor.

Midgard's shader descriptor, a block in the command stream which configures a shader, has a number of fields: the address of the compiled shader binary, the number of registers used, the number of attributes/varyings/uniforms, and so forth. I thought that was it. A shader descriptor from a replay with textures looked like this (reformatted for clarity):

    struct mali_shader_meta shader_meta = {
        .shader = (shader_memory + 1920) | 5, // XXX shader zero tripped
        .attribute_count = 0,
        .varying_count = 1,
        .uniform_registers = (0 << 20) | 0x20e00,
    };

That is, the shader code is at shader_memory + 1920; as a fragment shader, it uses no attributes but it does receive a single varying; it does not use any uniforms. All accounted for, right?

What's that comment about, "XXX shader zero tripped", then? There are frequently fields in the command stream that we observe to always be zero, for various reasons. Sometimes they are there for padding and alignment. Sometimes they correspond to a feature that none of our tests had used yet.
In any event, it is distracting for a command stream log to be filled with lines like: .zero0 = 0, .zero1 = 0, .zero2 = 0, .zero3 = 0, In an effort to keep everything tidy, fields that were observed to always be zero are not printed. Instead, the tracer just makes sure that the unprinted fields (which default to zero by the C compiler) are, in fact, equal to zero. If they are not, a warning is printed, stating that a “zero is tripped”, as if the field were a trap. When the reader of the log sees this line, they know that the replay is incomplete, as they are missing a value somewhere; a field was wrongly marked as “always zero”. It was a perfect system. At least, it would have been a perfect system, if I had noticed the warning. I was hyper-focused on the new texture and sampler descriptors, on the memory allocations for the texture memory itself, on the shader binaries – I was hyper-focused on textures that I only skimmed the rest of the log for anomalies. If I had – when I finally did, on that fateful Thursday – I would have realised that the zero was tripped. I would have committed a change like: - if (t->zero1) - panwrap_msg("XXX shader zero tripped\n"); + //if (t->zero1) + // panwrap_msg("XXX shader zero tripped\n"); + panwrap_prop("zero1 = %" PRId16, t->zero1); panwrap_prop("attribute_count = %" PRId16, t->attribute_count); panwrap_prop("varying_count = %" PRId16, t->varying_count); I would have then discovered that “zero1” was mysteriously equal to 65537 for my sample with a texture. And I would have noticed that suddenly, texture replay worked! Everything fell into place from then. Notice that 65537 in decimal is equal to 0x10001 in hex. With some spacing included for clarity, that’s 0x 0001 0001. Alternatively, instead of a single 32-bit word, it can be interpreted as two 16-bit integers: two ones in succession. What two things do we have one of in the command stream? Textures and samplers! Easy to enough to handle in the command stream: mali_ptr shader; - u32 zero1; + + u16 texture_count; + u16 sampler_count; /* Counted as number of address slots (i.e. half-precision vec4's) */ u16 attribute_count; After that, it was just a matter of moving code from the replay into the real driver, writing functions to translate Gallium commands into Midgard structures, implementing a routine in the compiler to translate NIR instructions to Midgard instructions, and a lot of debugging. A week later, all the core code for textures was in place… almost. The other big problem posed by textures is their internal format. In some graphics systems, textures are linear, the most intuitive format; that is, a pixel is accessed in the texture by texture[y*stride + x]. However, for reasons of cache locality, this format is a disaster for a GPU; instead, textures are stored “tiled” or “swizzled”. This article offers a good overview of tiled texture layouts. Texture tiling is great and powerful for hardware. It is less great and powerful for driver writers. Decoding the swizzling algorithm would have been a mammoth task, orthogonal to the command stream and shader work for textures. 3D drivers are complex – textures have three major components that are each orthogonal to each other. It would have been hopeless… if libv had not already decoded the layout when writing limare! The heavy lifting was done, all released under the MIT license. In an afternoon’s work, I extracted the relevant code from limare, cleaned it up a bit, and made it up about 20% faster (Abstract rounding). 
The resulting algorithm is still somewhat opaque, but it works! In a single thread on my armv7 RK3288 laptop, about 355 RGBA32 1080p textures can be swizzled in 10 seconds flat. I then integrated the swizzling code with the Gallium driver, and voilà, really – no, no, this time it's true – I'm not lying! – er, well, I had to finish a few other tasks before I could demo test-cube-textured, but… voilà! (Sorry for speaking Prancy.) Textures? Textures!

On the Bifrost side, Lyude Paul has continued her work writing an assembler. The parser, a surprisingly complex task given the nuances of the ISA, is now working reliably. Code emission is in its nascent stages, and her assembler is now making progress on instruction encoding. The first instructions have almost been emitted. May many more instructions follow.

However, an assembler for Bifrost is no good without a free driver to use it with; accordingly, Connor Abbott has continued his work investigating the Bifrost command stream. It continues to demonstrate considerable similarities to Midgard; luckily, much of the driver code will be shareable between the architectures. Like the assembler, this work is still in early stages, implemented in a personal branch, but early results look promising. And a little birdy told me that there might be T880 support in the pipes.
Posted 10 days ago
We've just released Oracle Solaris 11.3 SRU 32. It's available from My Oracle Support Doc ID 2045311.1, or via 'pkg update' from the support repository at https://pkg.oracle.com/solaris/support .

The following components have been updated to address security issues:
- Wireshark has been updated to 2.4.6
- ImageMagick has been updated to 6.9.9-40
- OpenSSL has been updated to 1.0.2o
- paramiko has been updated to 2.0.8
- python has been updated to 2.7.14
- Samba has been updated to 4.7.6
- Apache Web Server has been updated to 2.4.33
- rsyslog has been updated to 8.15.0
- Perl 5.22 has been updated

These enhancements have also been added:
- New recvmmsg() and sendmmsg() system calls have been added to improve network performance (a short usage sketch follows below)
- Explorer 18.2 is now available
- SunVTS 8.2.2 is now available
- netstat -P udp provides more per-UDP-socket statistics
- SO_REUSEPORT request distribution

Full details of this SRU can be found in My Oracle Support Doc 2396704.1. For the list of Service Alerts affecting each Oracle Solaris 11.3 SRU, see Important Oracle Solaris 11.3 SRU Issues (Doc ID 2076753.1).
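For readers who haven't used the batched socket calls before: recvmmsg() receives several datagrams with a single system call, which is where the network performance win comes from. Below is a minimal sketch following the commonly documented prototype (int recvmmsg(int, struct mmsghdr *, unsigned int, int, struct timespec *)); the exact declaration and required headers on Oracle Solaris should be checked against the local man pages, and error handling is kept to a minimum.

    /* Sketch: drain up to BATCH UDP datagrams in one recvmmsg() call.
     * On Linux this needs _GNU_SOURCE; on Solaris consult the man page
     * for the required feature-test macros and headers. */
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <stdio.h>
    #include <string.h>

    #define BATCH 8
    #define BUFSZ 1500

    int drain_udp(int sock)
    {
        struct mmsghdr msgs[BATCH];
        struct iovec   iovs[BATCH];
        char           bufs[BATCH][BUFSZ];

        memset(msgs, 0, sizeof(msgs));
        for (int i = 0; i < BATCH; i++) {
            iovs[i].iov_base = bufs[i];
            iovs[i].iov_len  = BUFSZ;
            msgs[i].msg_hdr.msg_iov    = &iovs[i];
            msgs[i].msg_hdr.msg_iovlen = 1;
        }

        /* one system call, up to BATCH datagrams returned */
        int n = recvmmsg(sock, msgs, BATCH, 0, NULL);
        if (n < 0) {
            perror("recvmmsg");
            return -1;
        }
        for (int i = 0; i < n; i++)
            printf("datagram %d: %u bytes\n", i, msgs[i].msg_len);
        return n;
    }

sendmmsg() is the transmit-side counterpart: fill an array of mmsghdr structures and submit them to the kernel in one call.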
Posted 12 days ago
This post is part of a four part series: Part 1, Part 2, Part 3, Part 4.

In the first three parts, I covered the X server and synaptics pointer acceleration curves and how libinput compares to the X server pointer acceleration curve. In this post, I will compare libinput to the synaptics acceleration curve.

Comparison of synaptics and libinput
libinput has multiple different pointer acceleration curves, depending on the device. In this post, I will only consider the one used for touchpads. So let's compare the synaptics curve with the libinput curve at the default configurations:

[Graph: Synaptics vs libinput's touchpad profile]

But this one doesn't tell the whole story, because the touchpad accel for libinput actually changes once we get faster. So here are the same two curves, but this time with the range up to 1000mm/s.

[Graph: Synaptics vs libinput's touchpad profile (full range)]

These two graphs show that libinput is both very different and similar. Both curves have an acceleration factor less than 1 for the majority of speeds, i.e. they both decelerate the touchpad more than they accelerate it. synaptics has two factors it sticks to and a short curve; libinput has a short deceleration curve and its plateau is the same or lower than synaptics for the most part. Once the threshold is hit at around 250 mm/s, libinput's acceleration keeps increasing until it hits a maximum much later.

So, for anything under ~20mm/s, libinput should be the same as synaptics (ignoring the <7mm/s deceleration). For anything less than 250mm/s, libinput should be slower. I say "should be" because that is not actually the case; synaptics is slower, so I suspect the server scaling slows down synaptics even further. Hacking around in the libinput code, I found that moving libinput's baseline to 0.2 matches the synaptics cursor's speed. However, AFAIK that scaling depends on the screen size, so your mileage may vary.

Comparing configuration settings
Let's overlay the libinput speed toggles. In Part 2 we've seen the synaptics toggles and they're open-ended, so it's a bit hard to pick a specific set to compare against. I'll be using the same combined configuration options from the diagram there.

[Graph: Synaptics configurations vs libinput speeds]

And we need the diagram from 0-1000mm/s as well.

[Graph: Synaptics configurations vs libinput speeds (full range)]

There isn't much I can talk about here in direct comparison; the curves are quite different and the synaptics curves vary greatly with the configuration options (even though the shape remains the same).

Analysis
It's fairly obvious that the acceleration profiles are very different once we depart from the default settings. Most notably, only libinput's slowest speed setting matches the 0.2 speed that is the synaptics default setting. In other words, if your touchpad is too fast compared to synaptics, it may not be possible to slow it down sufficiently. Likewise, even at the fastest speed, the baseline is well below the synaptics baseline for e.g. 0.6 [1], so if your touchpad is too slow, you may not be able to speed it up sufficiently (at least for low speeds). That problem won't exist for the maximum acceleration factor; the main question there is simply whether it is too high. Answer: I don't know.

So the base speed of the touchpad in libinput needs a higher range; that's IMO a definitive bug that I need to work on. The rest... I don't know. Let's see how we go.

[1] A configuration I found suggested in some forum when googling for MinSpeed, so let's assume there's at least one person out there using it.
Posted 12 days ago
This post is part of a four part series: Part 1, Part 2, Part 3, Part 4.

In Part 1 and Part 2 I showed the X server acceleration code as used by the evdev and synaptics drivers. In this part, I'll show how it compares against libinput.

Comparison to libinput
libinput has multiple different pointer acceleration curves, depending on the device. In this post, I will only consider the default one used for mice. A discussion of the touchpad acceleration curve comes later. So, back to the graph of the simple profile. Let's overlay this with the libinput pointer acceleration curve:

[Graph: Classic vs libinput's profile]

Turns out the pointer acceleration curve, mostly modeled after the xserver behaviour, roughly matches the xserver behaviour. Note that libinput normalizes to 1000dpi (provided MOUSE_DPI is set correctly) and thus the curves only match this way for 1000dpi devices. libinput's deceleration is slightly different, but I doubt it is really noticeable. The plateau of no acceleration is virtually identical, i.e. at slow speeds libinput moves like the xserver's pointer does. Likewise, for speeds above ~33mm/s, libinput and the server accelerate by the same amount. The actual curve is slightly different: it is a linear curve (I doubt that's noticeable) and it doesn't have that jump in it. The xserver acceleration maxes out at roughly 20mm/s. The only difference in acceleration is for the range of 10mm/s to 33mm/s. 30mm/s is still a relatively slow movement (just move your mouse by 30mm within a second, it doesn't feel fast). This means that for all but slow movements, the current server and libinput acceleration provides just a flat acceleration at whatever the maximum acceleration is set to.

Comparison of configuration options
The biggest difference libinput has to the X server is that it exposes a single knob of normalised continuous configuration (-1.0 == slowest, 1.0 == fastest). It relies on settings like MOUSE_DPI to provide enough information to map a device into that normalised range. (A minimal sketch of driving that knob through the libinput API appears at the end of this post.) Let's look at the libinput speed settings and their effect on the acceleration profile (libinput 1.10.x).

[Graph: libinput speed settings]

libinput's speed setting is a combination of changing thresholds and accel at the same time. The faster you go, the sooner acceleration applies and the higher the maximum acceleration is. For very slow speeds, libinput provides deceleration. Noticeable here though is that the baseline speed is the same until we get to speed settings of less than -0.5 (where we have an effectively flat profile anyway). So up to the (speed-dependent) threshold, the mouse speed is always the same.

Let's look at the comparison of libinput's speed setting to the accel setting in the simple profile:

[Graph: Comparison of libinput speed and accel settings]

Clearly obvious: libinput's range is a lot smaller than what the accel setting allows (that one is effectively unbounded). This obviously applies to the deceleration as well:

[Graph: Comparison of libinput speed and deceleration]

I'm not posting the threshold comparison, as Part 1 shows it does not affect the maximum acceleration factor anyway.

Analysis
So, where does this leave us? I honestly don't know. The curves are different, but the only paper I could find on comparing acceleration curves is Casiez and Roussel's 2011 UIST paper. It provides a comparison of the X server acceleration with the Windows and OS X acceleration curves [1]. It shows quite a difference between the three systems, but the authors note that no specific acceleration curve is definitely superior.

However, the most interesting bit here is that both the Windows and the OS X curve seem to be constant acceleration (with very minor changes) rather than changing the curve shape. Either way, there is one possible solution for libinput to implement: to change the base plateau with the speed. Otherwise libinput's acceleration curve is well defined for the configurable range. And a maximum acceleration factor of 3.5 is plenty for a properly configured mouse (generally anything above 3 is tricky to control). AFAICT, the main issues with pointer acceleration come from mice that either don't have MOUSE_DPI set, or from trackpoints, which are, unfortunately, a completely different problem. I'll probably also give the Windows/OS X approaches a try (i.e. same curve, different constant deceleration) and see how that goes. If it works well, that may be a solution because it's easier to scale into a large range. Otherwise, *shrug*, someone will have to come up with a better solution.

[1] I've never been able to reproduce the same gain (== factor) but at least the shape and x axis seem to match.
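As mentioned above, here is a minimal sketch of how that single normalised speed knob looks from the application side, using libinput's config API (libinput_device_config_accel_set_speed() takes the -1.0 to 1.0 value discussed in this post). Context and device setup are omitted; the device pointer is assumed to come from the usual libinput event loop.

    /* Sketch: setting the normalised pointer speed on a device via the
     * libinput config API. 'device' is assumed to come from a libinput
     * event (e.g. libinput_event_get_device()). */
    #include <libinput.h>
    #include <stdio.h>

    static void set_pointer_speed(struct libinput_device *device, double speed)
    {
        /* clamp to the documented range: -1.0 (slowest) .. 1.0 (fastest) */
        if (speed < -1.0) speed = -1.0;
        if (speed >  1.0) speed =  1.0;

        if (!libinput_device_config_accel_is_available(device)) {
            fprintf(stderr, "device has no pointer acceleration config\n");
            return;
        }

        enum libinput_config_status status =
            libinput_device_config_accel_set_speed(device, speed);
        if (status != LIBINPUT_CONFIG_STATUS_SUCCESS)
            fprintf(stderr, "failed to set speed %.2f\n", speed);
    }

In practice most users never call this directly; desktop environments drive the same knob through their own settings daemons, but it is the knob everything eventually lands on.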
Posted 12 days ago
This post is part of a four part series: Part 1, Part 2, Part 3, Part 4.

In Part 1 I showed the X server acceleration code as used by the evdev driver (which leaves all acceleration up to the server). In this part, I'll show the acceleration code as used by the synaptics touchpad driver. This driver installs a device-specific acceleration profile, but beyond that the acceleration is... difficult. The profile itself is not necessarily indicative of the real movement; the coordinates are scaled between device-relative, device-absolute, screen-relative, etc. so often that it's hard to keep track of what the real delta is. So let's look at the profile only.

Diagram generation
Diagrams were generated by gnuplot, parsing .dat files generated by the ptrveloc tool in the git repo. Helper scripts to regenerate all data are in the repo too. Default values unless otherwise specified:
- MinSpeed: 0.4
- MaxSpeed: 0.7
- AccelFactor: 0.04
- dpi: 1000 (used for converting units to mm)
All diagrams are limited to 100 mm/s and a factor of 5 so they are directly comparable. From earlier testing I found movements above 300 mm/s are rare; once you hit 500 mm/s the acceleration doesn't really matter that much anymore, you're going to hit the screen edge anyway.

The choice of 1000 dpi is a difficult one. It makes the diagrams directly comparable to those in Part 1, but touchpads have a great variety in their resolution. For example, an ALPS DualPoint touchpad may have resolutions of 25-32 units/mm. A Lenovo T440s has a resolution of 42 units/mm over PS/2 but 20 units/mm over the newer SMBus/RMI4 protocol. This is the same touchpad. Overall it doesn't actually matter that much though, see below.

The acceleration profile
This driver has a custom acceleration profile, configured by the MinSpeed, MaxSpeed and AccelFactor options. The former two put a cap on the factor, but MinSpeed also adjusts (overwrites) ConstantDeceleration. The AccelFactor defaults to a device-specific size based on the device diagonal. Let's look at the defaults of 0.4/0.7 for min/max and 0.04 (the default on my touchpad) for the accel factor:

[Graph: The synaptics acceleration profile]

The simple profile from Part 1 is shown in this graph for comparison. The synaptics profile is printed as two curves, one for the profile output value and one for the real value used on the delta. Unlike the simple profile, you cannot configure ConstantDeceleration separately; it depends on MinSpeed. Thus the real acceleration factor is always less than 1, so the synaptics driver doesn't accelerate as such, it controls how much the deltas are decelerated. The actual acceleration curve is just a plain old linear interpolation between the min and max acceleration values.

If you look at the curves closer, you'll find that there is no acceleration up to 20mm/s and flat acceleration from 25mm/s onwards. Only in this small speed range does the driver adjust its acceleration based on input speed. Whether this is intentional or just happened, I don't know.

The accel factor depends on the touchpad x/y axis. On my T440s using PS/2, the factor defaults to 0.04. If I get it to use SMBus/RMI4 instead of PS/2, that same device has an accel factor of 0.09. An ALPS touchpad may have a factor of 0.13, based on the min/max values for the x/y axes.
These devices all have different resolutions though, so here are the comparison graphs taking the axis range and the resolution into account:

[Graph: Acceleration curves on different touchpads]

The diagonal affects the accel factor, so these three touchpads (two curves are the same physical touchpad, just using a different bus) get slightly different acceleration curves. They're more similar than I expected though, and for the rest of this post we can get away with just looking at the 0.04 default value from my touchpad.

Note that due to how synaptics is handled in the server, this isn't the whole story; there is more coordinate scaling etc. happening after the acceleration code. The synaptics acceleration profile also does not accommodate uneven x/y resolutions; this is handled in the server afterwards. On touchpads with uneven resolutions the velocity thus depends on the vector: moving along the x axis provides differently sized deltas than moving along the y axis. However, anything applied later isn't speed dependent but merely a constant scale, so these curves are still a good representation of what happens.

The effect of configurations
What does the acceleration factor do? It changes when acceleration kicks in and how steep the acceleration is.

[Graph: Effect of the AccelFactor configuration option]

And how do the min/max values play together? Let's adjust MinSpeed but leave MaxSpeed at 0.7.

[Graph: Effect of the MinSpeed configuration option]

MinSpeed lifts the baseline (i.e. the minimum acceleration factor), somewhat expected from a parameter named this way. But it looks again like we have a bug here. When MinSpeed and MaxSpeed are close together, our acceleration actually decreases once we're past the threshold. So, counterintuitively, a higher MinSpeed can result in a slower cursor once you move faster. MaxSpeed is not too different here:

[Graph: Effect of the MaxSpeed configuration option]

The same bug is present: if the MaxSpeed is smaller than or close to MinSpeed, our acceleration actually goes down. A quick check of the sources didn't indicate anything enforcing MinSpeed < MaxSpeed either. But otherwise MaxSpeed lifts the maximum acceleration factor.

These graphs look at the options in isolation; in reality users would likely configure both MinSpeed and MaxSpeed at the same time. Since both have an immediate effect on pointer movement, trial-and-error configuration is simple and straightforward. Below is a graph of all three adjusted semi-randomly:

[Graph: Effect of all three configuration options combined]

No surprises in there: the baseline (and thus slowest speed) changes, the maximum acceleration changes, and how long it takes to get there changes. The curves vary quite a bit though, so without knowing the configuration options, it's impossible to predict how a specific touchpad behaves.

Epilogue
The graphs above show the effect of configuration options in the synaptics driver. I purposely didn't put any specific analysis in and/or compare it to libinput. That comes in a future post.
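To make the shape of that curve easier to picture, here is a deliberately simplified reconstruction in C of what the post describes: a clamped linear ramp between MinSpeed and MaxSpeed, driven by AccelFactor. This is not the synaptics driver source; it ignores the MinSpeed-derived ConstantDeceleration and all later server-side scaling, and only illustrates the curve shape and why the MinSpeed/MaxSpeed bug above can appear.

    /* Deliberately simplified reconstruction of the curve shape described
     * above -- NOT the synaptics driver source. speed is the input speed;
     * the return value corresponds to the "profile output" curve before
     * deceleration and server-side scaling are applied. */
    double synaptics_like_profile(double speed,
                                  double min_speed,    /* MinSpeed, e.g. 0.4 */
                                  double max_speed,    /* MaxSpeed, e.g. 0.7 */
                                  double accel_factor) /* AccelFactor, e.g. 0.04 */
    {
        double f = speed * accel_factor;    /* plain linear ramp */

        if (f < min_speed)
            f = min_speed;                  /* baseline below the ramp */
        if (f > max_speed)
            f = max_speed;                  /* cap once you move fast */

        /* Nothing here enforces min_speed < max_speed. If MaxSpeed is at or
         * below MinSpeed, fast movements clamp to the *lower* value, which
         * is exactly the counterintuitive behaviour described above. */
        return f;
    }

With the default values the ramp is only active over a narrow band of speeds, in line with the observation above that the driver adjusts its acceleration only within a small speed range.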
Posted 12 days ago
This post is part of a four part series: Part 1, Part 2, Part 3, Part 4.

Over the last few days, I once again tried to tackle pointer acceleration. After all, I still get plenty of complaints about how terrible libinput is and how the world was so much better without it. So I once more tried to understand the X server's pointer acceleration code. Note: the evdev driver doesn't do any acceleration, it's all handled in the server. Synaptics will come in part two, so this here focuses mostly on pointer acceleration for mice/trackpoints.

After a few failed attempts of live analysis [1], I finally succeeded in extracting the pointer acceleration code into something that could be visualised. That helped me a great deal in going back and understanding the various bits and how they fit together. The approach was: copy the ptrveloc.(c|h) files into a new project, set up a meson.build file, #define all the bits that are assumed to be there and voila, here's your library. Now we can build basic analysis tools, provided we initialise all the structs the pointer accel code needs correctly. I think I succeeded. The git repo is here if anyone wants to check the data. All scripts to generate the data files are in the repository.

A note on language: the terms "speed" and "velocity" are subtly different, but for this post the difference doesn't matter. The code uses "velocity" but "speed" is more natural to talk about, so just assume equivalence.

The X server acceleration code
There are 15 configuration options for pointer acceleration (ConstantDeceleration, AdaptiveDeceleration, AccelerationProfile, ExpectedRate, VelocityTrackerCount, Softening, VelocityScale, VelocityReset, VelocityInitialRange, VelocityRelDiff, VelocityAbsDiff, AccelerationProfileAveraging, AccelerationNumerator, AccelerationDenominator, AccelerationThreshold). Basically, every number is exposed as a configurable knob. The acceleration code is a product of a time when we were handing out configuration options like participation medals at a children's footy tournament. Assume that for the rest of this blog post, every behavioural description ends with "unless specific configuration combinations apply". In reality, I think only four options are commonly used: AccelerationNumerator, AccelerationDenominator, AccelerationThreshold, and ConstantDeceleration. These four have immediate effects on the pointer movement and thus it's easy to do trial-and-error configuration.

The server has different acceleration profiles (called the 'pointer transfer function' in the literature). Each profile is a function that converts speed into a factor. That factor is then combined with other things like constant deceleration, but eventually our output delta forms as:

    delta_out(x, y) = delta_in(x, y) * factor * deceleration

The output delta is passed back to the server and the pointer saunters over by a few pixels, happily bumping into any screen edge on the way.

The input for the acceleration profile is a speed in mickeys, a threshold (in mickeys) and a max accel factor (unitless). Mickeys are a bit tricky: a mickey is one unit of relative movement as reported by the device, so its physical size depends on the device resolution. This means the acceleration is device-specific; the deltas for a mouse at 1000 dpi are 25% larger than the deltas for a mouse at 800 dpi (assuming the same physical distance and speed). The "Resolution" option in evdev can work around this, but by default this means that the acceleration factor is (on average) higher for high-resolution mice for the same physical movement.
It also means that that xorg.conf snippet you found on stackoverflow probably does not do the same on your device. The second problem with mickeys is that they require a frequency to map to a physical speed. If a device sends events every N ms, delta/N gives us a speed in units/ms. But we need mickeys for the profiles. Devices generally have a fixed reporting rate, and the speed of each mickey is the same as (units/ms * reporting rate). This rate defaults to 10 in the server (the VelocityScale default value) and thus matches a device reporting at 100Hz (a discussion of this comes later). All graphs below were generated with this default value.

Back to the profile function and how it works: the threshold (usually) defines the minimum speed at which acceleration kicks in. The max accel factor (usually) limits the acceleration. So the simplest algorithm is:

    if (velocity < threshold)
        return base_velocity;
    factor = calculate_factor(velocity);
    if (factor > max_accel)
        return max_accel;
    return factor;

In reality, things are somewhere between this simple and "whoops, what have we done". (A fuller sketch of how the profile combines with the constant deceleration appears just before the epilogue.)

Diagram generation
Diagrams were generated by gnuplot, parsing .dat files generated by the ptrveloc tool in the git repo. Helper scripts to regenerate all data are in the repo too. Default values unless otherwise specified:
- threshold: 4
- accel: 2
- dpi: 1000 (used for converting units to mm)
- constant deceleration: 1
- profile: classic
All diagrams are limited to 100 mm/s and a factor of 5 so they are directly comparable. From earlier testing I found movements above 300 mm/s are rare; once you hit 500 mm/s the acceleration doesn't really matter that much anymore, you're going to hit the screen edge anyway.

Acceleration profiles
The server provides a number of profiles, but I have seen very little evidence that people use anything but the default "Classic" profile. Synaptics installs a device-specific profile. Below is a comparison of the profiles just so you get a rough idea what each profile does. For this post, I'll focus on the default Classic only.

[Graph: Comparison of the different profiles]

First thing to point out here is that if you want to have your pointer travel to Mars, the linear profile is what you should choose. This profile is unusable without further configuration to bring the incline to a more sensible level. Only the simple and limited profiles have a maximum factor; all others increase acceleration indefinitely. The faster you go, the more it accelerates the movement. I find them completely unusable at anything but low speeds. The classic profile transparently maps to the simple profile, so the curves are identical.

Anyway, as said above, profile changes are rare. The one we care about is the default profile: the classic profile, which transparently maps to the simple profile (SimpleSmoothProfile() in the source).

[Graph: The default profile]

Looks like there's a bug in the profile formula. At the threshold value it jumps from 1 to 1.5 before the curve kicks in. This code was added in ~2008; apparently no-one noticed this in a decade. The profile has deceleration (accel factor < 1 and thus decreasing the deltas) at slow speeds. This provides extra precision at slow speeds without compromising pointer speed at higher physical speeds.

The effect of config options
Ok, now let's look at the classic profile and the configuration options. What happens when we change the threshold?

[Graph: Different thresholds]

First thing that sticks out: one of these is not like the others.
The classic profile changes to the polynomial profile at thresholds less than 1.0. *shrug* I think there's some historical reason, I didn't chase it up. Otherwise, the threshold not only defines when acceleration starts kicking in, it also affects the steepness of the curve. So a higher threshold also means acceleration kicks in more slowly as the speed increases. It has no effect on the low-speed deceleration.

What happens when we change the max accel factor? This factor is actually set via the AccelerationNumerator and AccelerationDenominator options (because floats used to be more expensive than buying a house). At runtime, the Xlib function of your choice is XChangePointerControl(). That's what all the traditional config tools use (xset, your desktop environment pre-libinput, etc.).

[Graph: Different maximum acceleration]

First thing that sticks out: one is not like the others. When max acceleration is 0, the factor is always zero for speeds exceeding the threshold. No user impact though; the server discards factors of 0.0 and leaves the input delta as-is. Otherwise it's relatively unexciting: it changes the maximum acceleration without changing the incline of the function, and it has no effect on deceleration. Because the curves aren't linear ones, they don't overlap 100% but meh, whatever. The higher values are cut off in this view, but they just look like a larger version of the visible 2 and 4 curves.

Next config option: ConstantDeceleration. This one is handled outside of the profile, but the code is easy enough to follow: it's a basic multiplier applied together with the factor. (I cheated and just did this in gnuplot directly.)

[Graph: Different deceleration]

Easy to see what happens with the curve here: it simply stretches vertically without changing the properties of the curve itself. If the deceleration is greater than 1, we get constant acceleration instead. All this means that with the default profile, we have 3 ways of adjusting it. What we can't directly change is the incline, i.e. the actual process of acceleration remains the same.

Velocity calculation
As mentioned above, the profile applies to a velocity, so obviously we need to calculate that first. This is done by storing each delta and looking at their direction and individual velocity. As long as the direction remains roughly the same and the velocity between deltas doesn't change too much, the velocity is averaged across multiple deltas - up to 16 in the default config. Of course you can change whether this averaging applies, the max time deltas or velocity deltas, etc. I'm honestly not sure anyone ever used any of these options intentionally or with any real success.

Velocity scaling was explained above (units/ms * reporting rate). The default value for the reporting rate is 10, equivalent to 100Hz. Of the 155 frequencies currently defined in 70-mouse.hwdb, only one is 100 Hz. The most common one here is 125Hz, followed by 1000Hz, followed by 166Hz and 142Hz. Now, the vast majority of devices don't have an entry in the hwdb file, so this data does not represent a significant sample set. But for modern mice, the default velocity scale of 10 is probably off by between 25% and a factor of 10. While this doesn't mean much for the local example (users generally just move the numbers around until they're happy enough), it means that the actual values are largely meaningless for anyone but those with the same hardware. Of note: the synaptics driver automatically sets VelocityScale to 80Hz. This is correct for the vast majority of touchpads.
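As promised earlier, here is a rough sketch in C of how the pieces described in this post fit together, feeding the formula delta_out = delta_in * factor * deceleration. This is a reconstruction from the description above, not the server's ptrveloc.c; in particular, calculate_factor() is a placeholder ramp, and the real code layers velocity tracking, softening and the remaining eleven options on top.

    /* Reconstruction of the simplified behaviour described in this post --
     * NOT the actual ptrveloc.c code. calculate_factor() is a placeholder. */
    struct accel_params {
        double threshold;    /* mickeys; default 4 */
        double max_accel;    /* AccelerationNumerator/Denominator; default 2 */
        double deceleration; /* the deceleration term in the formula; default 1 */
    };

    /* placeholder for the real curve: some ramp rising from 1 upwards */
    static double calculate_factor(double velocity, const struct accel_params *p)
    {
        return velocity / p->threshold;
    }

    static double simple_profile(double velocity, const struct accel_params *p)
    {
        if (velocity < p->threshold)
            return 1.0;                        /* no acceleration below threshold */
        double factor = calculate_factor(velocity, p);
        return factor > p->max_accel ? p->max_accel : factor;
    }

    /* delta_out(x, y) = delta_in(x, y) * factor * deceleration */
    static void accelerate_delta(double *dx, double *dy, double velocity,
                                 const struct accel_params *p)
    {
        double f = simple_profile(velocity, p) * p->deceleration;
        *dx *= f;
        *dy *= f;
    }

A deceleration term below 1 shrinks every delta (the "constant deceleration" case); above 1 it turns into constant acceleration, exactly as described for the ConstantDeceleration graphs above.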
Epilogue
The graphs above show the X server's pointer acceleration for mice, trackballs and other devices, and the effects of the configuration toggles. I purposely did not put any specific analysis in and/or comparison to libinput. That will come in a future post.

[1] I still have a branch somewhere where the server prints yaml to the log file which can then be extracted by shell scripts, passed on to python for processing and ++++ out of cheese error. redo from start ++++