Posted over 5 years ago
Downloads
If you're curious about the slides, you can download the PDF or
the ODP.
Thanks
This post has been a part of work undertaken by my employer Collabora.
I would like to thank the wonderful organizers of OSSummit NA, the Linux Foundation, for hosting a great event.
|
Posted over 5 years ago
Window Scaling
One of the ideas we had in creating the compositing mechanism was to
be able to scale window contents for the user -- having the window
contents available as an image provides for lots of flexibility for
presentation.
However, while we've seen things like “overview mode” (presenting
all of the application windows scaled and tiled for easy selection),
we haven't managed to interact with windows in scaled form. That is,
until yesterday.
glxgears thinks the window is only 32x32 pixels in size. xfd is scaled
by a factor of 2. xlogo is drawn at the normal size.
Two Window Sizes
The key idea for window scaling is to have the X server keep track of
two different window sizes -- the area occupied by the window within
its parent, and the area available for the window contents, including
descendants. For now, at least, the origin of the window is the same
between these two spaces, although I don't think there's any reason
they would have to be.
Current Size. This is the size as seen from outside the window, and
as viewed by all clients other than the owner of the window. It
reflects the area within the parent occupied by the window,
including the area which captures pointer events. This can probably
use a better name.
Owner Size. This is the size of the window viewed from inside the
window, and as viewed by the owner of the window. When composited,
the composite pixmap gets allocated at this size. When
automatically composited, the X server will scale the image of the
window from this size to the current size.
Clip Lists
Normally, when computing the clip list for a composited window, the X
server uses the current size of the window (aka the “borderSize” region)
instead of just the portion of the window which is not clipped by the
ancestor or sibling windows. This is how we capture output which is
covered by those windows and can use it to generate translucent
effects.
With an output size set, I use the owner size instead of the current
size. All un-redirected descendants are thus clipped to this overall
geometry.
Sub Windows
Descendant windows are left almost entirely alone; they keep their
original geometry, both position and size. Because the output sized
window retains its original position, all of the usual coordinate
transformations 'just work'. Of course, the clipping computations will
start with a scaled clip list for the output sized window, so the
descendants will have different clipping. There's surprisingly little
effect otherwise.
Output Handling
When an owner size is set, the window gets compositing enabled. The
composite pixmap is allocated at the owner size instead of the current
size. When no compositing manager is running, the automatic
compositing painting code in the server now scales the output from the
output size to the current size.
Most X applications don't have borders, but I needed to figure out
what to do in case one appeared. I decided that the border should be
the same size in the output and current presentations. That's about
the only thing that I could get to make sense; the border is 'outside'
the window size, so if you want to make the window contents twice as
big, you want to make the window size twice as big, not some function
of the border width.
About the only trick was getting the transformation from output size
to current size correct in the presence of borders. That took a few
iterations, but I finally just wrote down a few equations and solved
for the necessary values. Note that Render transforms take destination
space coordinates and generate source space coordinates, so they
appear “backwards”. While Render supports projective transforms, this
one is just scaling and translation, so we just need:
x_output_size = A * x_current_size + B
Now, we want the border width for input and output to be the same,
which means:
border_width + output_size = A * (border_width + current_size) + B
border_width = A * border_width + B
Now we can solve for A:
output_size = A * current_size
A = output_size / current_size
And for B:
border_width = output_size / current_size * border_width + B
B = (1 - output_size / current_size) * border_width
With these, we can construct a suitable transformation matrix:
⎡ Ax 0 Bx ⎤
⎢ 0 Ay By ⎥
⎣ 0 0 1 ⎦
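The algebra above is easy to check numerically. This is a hedged, standalone sketch (not code from the window-scaling branch) that computes A and B for one axis:

```c
#include <assert.h>
#include <math.h>

/* Map current-size coordinates to output-size coordinates while keeping
 * the border width invariant, per the derivation above:
 *   A = output_size / current_size
 *   B = (1 - A) * border_width
 */
void
compute_scale(double output_size, double current_size, double border_width,
              double *A, double *B)
{
    *A = output_size / current_size;
    *B = (1.0 - *A) * border_width;
}
```

For example, scaling a 200-pixel window to 400 pixels with a 5-pixel border gives A = 2 and B = -5, so the border origin maps to itself (2·5 - 5 = 5) and the border-inclusive edge maps correctly (2·205 - 5 = 405 = 5 + 400).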
Input Handling
Input device root coordinates need to be adjusted for owner sized
windows. If you nest an owner sized window inside another owner sized
window, then there are two transformations involved.
There are actually two places where these transformations need to be
applied:
To compute which window the pointer is in. If an output sized
window has descendants, then the position of the pointer within
the output window needs to be scaled so that the correct
descendent is identified as containing the pointer.
To compute the correct event coordinates when sending events to
the window. I decided not to attempt to separate the window owner
from other clients for event delivery; all clients see the same
coordinates in events.
Both of these require the ability to transform the event coordinates
relative to the root window. To do that, we translate from root
coordinates to window coordinates, scale by the ratio of output to
current size and then translate back:
void
OwnerScaleCoordinate(WindowPtr pWin, double *xd, double *yd)
{
    if (wOwnerSized(pWin)) {
        *xd = (*xd - pWin->drawable.x) * (double) wOwnerWidth(pWin) /
              (double) pWin->drawable.width + pWin->drawable.x;
        *yd = (*yd - pWin->drawable.y) * (double) wOwnerHeight(pWin) /
              (double) pWin->drawable.height + pWin->drawable.y;
    }
}
This moves the device to the scaled location within the output sized
windows. Performing this transformation from the root window down to
the target window adjusts the position correctly even when there is
more than one output sized window among the window ancestry.
Case 1. is easy; XYToWindow, and the associated miSpriteTrace
function, already traverse the window tree from the root for each
event. Each time we descend through a window, we apply the
transformation so that subsequent checks for descendants will check
the correct coordinates. At each step, I use OwnerScaleCoordinate for
the transformation.
Case 2. means taking an arbitrary window and walking up the window
tree to the root and then performing each transformation on the way
back down. Right now, I'm doing this recursively, but I'm reasonably
sure it could be done iteratively instead:
void
ScaleRootCoordinate(WindowPtr pWin, double *xd, double *yd)
{
    if (pWin->parent)
        ScaleRootCoordinate(pWin->parent, xd, yd);
    OwnerScaleCoordinate(pWin, xd, yd);
}
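The iterative variant mentioned above could look roughly like this. This is only a sketch, not code from the window-scaling branch: the types are simplified stand-ins for the server's window structures, only one axis is shown, and a fixed maximum tree depth is assumed:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the server's window tree; the real code walks
 * WindowPtr/drawable fields instead. One axis shown for brevity. */
typedef struct Window {
    struct Window *parent;
    double scale;   /* owner size / current size; 1.0 when not owner sized */
    double origin;  /* window origin in root coordinates */
} Window;

/* Per-window transform, mirroring what OwnerScaleCoordinate does per axis. */
void
scale_coordinate(Window *w, double *xd)
{
    *xd = (*xd - w->origin) * w->scale + w->origin;
}

/* Iterative equivalent of the recursive ScaleRootCoordinate: record the
 * ancestor chain, then apply each transform from the root downward. */
void
scale_root_coordinate(Window *w, double *xd)
{
    Window *chain[64];          /* assumes tree depth <= 64 */
    int depth = 0;

    for (; w; w = w->parent)
        chain[depth++] = w;
    while (depth--)
        scale_coordinate(chain[depth], xd);
}
```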
Events and Replies
To make the illusion for the client work, everything the client hears
about the window needs to be adjusted so that the window seems to be
the owner size and not the current size.
Input events. The root coordinates are modified as described above,
and then the window-relative coordinates are computed as usual—by
subtracting the window origin from the root position. That's
because the windows are all left in their original location.
ConfigureNotify events. These events are rewritten before being
delivered to the owner so that the width and height reflect the
owner size. Because window managers send synthetic configure notify
events when moving windows, I also had to rewrite those events,
or the client would get the wrong size information.
PresentConfigureNotify events. For these, I decided to rewrite the
size values for all clients. As these are intended to be used to
allocate window buffers for presentation, the right size is always
the owner size.
OwnerWindowSizeNotify events. I created a new event so that the
compositing manager could track the owner size of all child
windows. That's necessary because the X server only performs the
output size scaling operation for automatically redirected windows;
if the window is manually redirected, then the compositing manager
will have to perform the scaling operation instead.
GetGeometry replies. These are rewritten for the window owner to
reflect the owner size value. Other clients see the current size
instead.
GetImage replies. I haven't done this part yet, but I think I need
to scale the window image for clients other than the owner. In
particular, xwd currently fails with a Match error when it sees a
window with a non-default visual that has an output size smaller
than the window size. It tries to perform a GetImage operation
using the current size, which fails when the server tries to fetch
that rectangle from the owner-sized window pixmap.
Composite Extension Changes
I've stuck all of this stuff into the Composite extension, mostly
because you need to use Composite to capture the scaled window output
anyway.
12. Composite Events (0.5 and later)
Version 0.5 of the extension defines an event selection mechanism
and a couple of events.
COMPOSITEEVENTTYPE {
CompositePixmapNotify = 0
CompositeOwnerWindowSizeNotify = 1
}
Event type delivered in events
COMPOSITEEVENTMASK {
CompositePixmapNotifyMask = 0x0001
CompositeOwnerWindowSizeNotifyMask = 0x0002
}
Event select mask for CompositeSelectInput
⎡
⎢ CompositeSelectInput
⎢
⎢ window: Window
⎢ enable SETofCOMPOSITEEVENTMASK
⎣
This request selects the set of events that will be delivered to the
client from the specified window.
CompositePixmapNotify
type: CARD8 XGE event type (35)
extension: CARD8 Composite extension request number
sequence-number: CARD16
length: CARD32 0
evtype: CARD16 CompositePixmapNotify
window: WINDOW
windowWidth: CARD16
windowHeight: CARD16
pixmapWidth: CARD16
pixmapHeight: CARD16
This event is delivered whenever the composite pixmap for a window is
created, changed or deleted. When the composite pixmap is deleted,
pixmapWidth and pixmapHeight will be zero. The client can call
NameWindowPixmap to assign a resource ID for the new pixmap.
13. Output Window Size (0.5 and later)
⎡
⎢ CompositeSetOwnerWindowSize
⎢
⎢ window: Window
⎢ width: CARD16
⎢ height: CARD16
⎣
This request specifies that the owner-visible window size will be
set to the provided value, overriding the actual window size as
seen by the owner. If composited, the composite pixmap will be
created at this size. If automatically composited, the server will
scale the output from the owner size to the current window size.
If the window is mapped, an UnmapWindow request is performed
automatically first. Then the owner size is set. A
CompositeOwnerWindowSizeNotify event is then
generated. Finally, if the window was originally mapped, a
MapWindow request is performed automatically.
Setting the width and height to zero will clear the owner size
value and cause the window to resume normal behavior.
Input events will be scaled from the actual window size to the
owner size for all clients.
A Match error is generated if:
The window is a root window
One, but not both, of width/height is zero
And, of course, you can retrieve the current size too:
⎡
⎢ CompositeGetOwnerWindowSize
⎢
⎢ window: Window
⎢
⎢ →
⎢
⎢ width: CARD16
⎢ height: CARD16
⎣
This request returns the current owner window size, if
set. Otherwise it returns 0,0, indicating that there is no owner
window size set.
CompositeOwnerWindowSizeNotify
type: CARD8 XGE event type (35)
extension: CARD8 Composite extension request number
sequence-number: CARD16
length: CARD32 0
evtype: CARD16 CompositeOwnerWindowSizeNotify
window: WINDOW
windowWidth: CARD16
windowHeight: CARD16
ownerWidth: CARD16
ownerHeight: CARD16
This event is generated whenever the owner size of the window is
set. windowWidth and windowHeight report the current window
size. ownerWidth and ownerHeight report the owner window size.
Git repositories
These changes are in various repositories at gitlab.freedesktop.org
all using the “window-scaling” branch:
X server
xorg proto
libXcomposite
xcb proto
And here's a sample command line app which modifies the owner scaling
value for an existing window:
xownersize
Current Status
This stuff is all very new; I started writing code on Friday evening
and got a simple test case working. I then spent Saturday making most
of it work, and today finding a pile of additional cases that needed
handling. I know that GetImage is broken; I'm sure lots of other stuff
is also not quite right.
I'd love to get feedback on whether the API and feature set seem
reasonable or not.
|
Posted over 5 years ago
robertfoss@xps9570 ~/work/libdrm $ git ru
remote: Counting objects: 234, done.
remote: Compressing objects: 100% (233/233), done.
remote: Total 234 (delta 177), reused 0 (delta 0)
Receiving objects: 100% (234/234), 53.20 KiB | 939.00 KiB/s, done.
Resolving deltas: 100% (177/177), completed with 36 local objects.
From git://anongit.freedesktop.org/mesa/drm
cb592ac8166e..bcb9d976cd91 master -> upstream/master
* [new tag] libdrm-2.4.93 -> libdrm-2.4.93
* [new tag] libdrm-2.4.94 -> libdrm-2.4.94
The idea here is that, by issuing a single short command, we can fetch the
latest master branch from the upstream repository of the codebase we're
working on and set our local master branch to point to the most recent
upstream/master.
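The alias itself isn't shown in this excerpt, but a definition along these lines would produce the behaviour above. This is a guess at the shape of such an alias, not necessarily the author's actual `ru` command, and it assumes a remote named `upstream` already exists:

```shell
# Hypothetical "ru" (remote update) alias: fetch upstream, then
# fast-forward the local master branch to upstream/master.
git config --global alias.ru \
    '!git fetch upstream && git checkout master && git merge --ff-only upstream/master'
```

After this, `git ru` fetches from `upstream` (tags included, as in the transcript) and fast-forwards `master`; the `--ff-only` guard refuses to create a merge commit if the branches have diverged.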
|
Posted over 5 years ago
The DRM (direct rendering manager, not the content protection stuff) graphics
subsystem in the linux kernel does not have a generic 2D acceleration API,
despite an awful lot of GPUs having more or less featureful blitter
units. And many systems need them for a lot of use-cases, because the 3D engine
is a bit too slow or too power hungry for just rendering desktops.
It’s a FAQ why this doesn’t exist and why it won’t get added, so I figured I’ll
answer this once and for all.
Bit of nomenclature upfront: A 2D engine (or blitter) is a bit of hardware that
can copy stuff with some knowledge of the 2D layout usually used for pixel
buffers. Some blitters also can do more like basic blending, converting color
spaces or stretching/scaling. A 3D engine on the other hand is a fancy,
high performance compute block, which runs small programs (called shaders) on
a massively parallel architecture, generally with huge memory bandwidth and a
dedicated controller to feed this beast through an asynchronous command buffer.
3D engines happen to be really good at rendering the pixels for 3D action games,
among other things.
There’s no 2D Acceleration Standard
3D has it easy: There’s OpenGL and Vulkan and DirectX that require a certain
feature set. And huge market forces that make sure if you use these features
like a game would, rendering is fast.
Aside: This means the 2D engine in a browser actually needs to work like a
3D action game, or the GPU will crawl. The impedance mismatch compared to
traditional 2D rendering designs is huge.
On the 2D side there’s no such thing: Every blitter engine is its own bespoke
thing, with its own features, limitations and performance characteristics.
There’s also no standard benchmarks that would drive common performance
characteristics - today blitters are needed mostly in small systems, with very
specific use cases. Anything big enough to run more generic workloads will have
a 3D rendering block anyway. These systems still have blitters, but mostly just
to help move data in and out of VRAM for the 3D engine to consume.
Now the huge problem here is that you need to fill these gaps in various
hardware 2D engines using CPU side software rendering. The crux with any 2D
render design is that transferring buffers and data too often between the GPU
and CPU will kill performance. Usually the cliff is so steep that pure
CPU rendering using only software easily beats any simplistic 2D acceleration
design.
The only way to fix this is to be really careful when moving data between the
CPU and GPU for different rendering operations. Sticking to one side, even if
it’s a bit slower, tends to be an overall win. But these decisions highly depend
upon the exact features and performance characteristics of your 2D engine.
Putting a generic abstraction layer in the middle of this stack, where it’s guaranteed
to be if you make it a part of the kernel/userspace interface, will not result
in actual acceleration.
So either you make your 2D rendering look like it’s a 3D game, using 3D
interfaces like OpenGL or Vulkan. Or you need a software stack that’s bespoke to
your use-case and the specific hardware you want to run on.
2D Acceleration is Really Hard
This is the primary reason really. If you don’t believe that, look at all the
tricks a browser employs to render CSS and HTML and text really fast, while
still animating all that stuff smoothly. Yes, a web-browser is the pinnacle of
current 2D acceleration tech, and you really need all the things in there for
decent performance: Scene graphs, clever render culling, massive batching and
huge amounts of pain to make sure you don’t have to fall back to CPU based
software rendering at the wrong point in a rendering pipeline. Plus managing
all kinds of assorted caches to balance reuse against running out of memory.
Unfortunately lots of people assume 2D must be a lot simpler than 3D rendering,
and therefore they can design a 2D API that’s fast enough for everyone. No one
jumps in and suggests we’ll have a generic 3D interface at the kernel level,
because the lessons there are very clear:
The real application interface is fairly high level, and in userspace.
There’s a huge industry group doing really hard work to specify these
interfaces.
The actual kernel to userspace interface ends up being highly specific to the
hardware and architecture of the userspace driver (which contains most of the
magic). Any attempt at a generic interface leaves lots of hardware specific
tricks and hence performance on the floor.
3D APIs like OpenGL or Vulkan have all the batching and queueing and memory
management issues covered in one way or another.
There are a bunch of DRM drivers which have support for 2D render engines
exposed to userspace. But they all use highly hardware specific interfaces,
fully streamlined for the specific engine. And they all require a decently sized
chunk of driver code in userspace to translate from a generic API to the
hardware formats. This is what DRM maintainers will recommend you do, if you
submit a patch to add a generic 2D acceleration API.
Exactly like a 3D driver.
If All Else Fails, There’s Options
Now if you don’t care about the last bit of performance, and your use-case is
limited, and your blitter engine is limited, then there are already options:
You can take whatever pixel buffer you have, export it as a dma-buf, and then
import it into some other subsystem which already has some kind of limited 2D
acceleration support. Depending upon your blitter engine, that could be a v4l2
mem2m device, or for simpler things there are also dmaengines.
On top, the DRM subsystem does allow you to implement the traditional
acceleration methods exposed by the fbdev subsystem, in case you have userspace
that really insists on using these; it’s not recommended for anything new.
What about KMS?
The above is kind of a lie, since the KMS (kernel modesetting) IOCTL userspace API
is a fairly full-featured 2D rendering interface. The aim of course is to render
different pixel buffers onto a screen. With the recently added writeback support,
operations targeting memory are now possible. This could be used to expose a
traditional blitter, if you only expose writeback support and no other outputs
in your KMS driver.
There’s a few downsides:
KMS is highly geared for compositing just a few buffers (hardware usually has
a very limited set of planes). For accelerated text rendering you want to do a
composite operation for each character, which means this has rather limited
use.
KMS only needs to run at 60Hz, or whatever the refresh rate of your monitor
is. It’s not optimized for efficiency at higher throughput at all.
So all together this isn’t the high-speed 2D acceleration API you’re looking for
either. It is a valid alternative to the options above though, e.g. instead of a
v4l2 mem2m device.
FAQ for the FAQ, or: OpenVG?
OpenVG isn’t the standard you’re looking for either. For one it’s a userspace
API, like OpenGL. All the same reasons for not implementing a generic OpenGL
interface at the kernel/userspace apply to OpenVG, too.
Second, the Mesa3D userspace library did support OpenVG once. Didn’t gain
traction, got canned. Just because it calls itself a standard doesn’t make it a
widely adopted industry default. Unlike OpenGL/Vulkan/DirectX on the 3D side.
Thanks to Dave Airlie and Daniel Stone for reading and commenting on drafts of this
text.
|
Posted over 5 years ago
This is mostly a request for testing, because I've received zero feedback on the patches that I merged a month ago and libinput 1.12 is due to be out. No comments so far on the RC1 and RC2 either, so... well, maybe this gets a bit broader attention so we can address some things before the release. One can hope.

Required reading for this article: Observations on trackpoint input data and X server pointer acceleration analysis - part 5.

As the blog posts linked above explain, the trackpoint input data is difficult and largely arbitrary between different devices. The previous pointer acceleration in libinput relied on a fixed reporting rate, which isn't true at low speeds, so the new acceleration method switches back to velocity-based acceleration, i.e. we convert the input deltas to a speed, then apply the acceleration curve on that. (It's not speed, it's pressure, but it doesn't really matter unless you're a stickler for technicalities.)

Because basically every trackpoint has different random data ranges not linked to anything easily measurable, libinput's device quirks now support a magic multiplier to scale the trackpoint range into something resembling a sane range. This is basically what we did before with the systemd POINTINGSTICK_CONST_ACCEL property, except that we're handling this in libinput now (which is where acceleration is handled, so it kinda makes sense to move it here). There is no good conversion from the previous trackpoint range property to the new multiplier because the range didn't really have any relation to the physical input users expected.

So what does this mean for you? Test the libinput RCs or, better, libinput from master (because it's stable anyway), or from the Fedora COPR and check if the trackpoint works. If not, check the Trackpoint Configuration page and follow the instructions there.
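For illustration, a local quirks override of this kind would look roughly like the following. The section name, match rule, and multiplier value here are made-up examples, and the file path may vary by distribution, so check the libinput quirks documentation for the attributes that apply to your hardware:

```ini
# e.g. /etc/libinput/local-overrides.quirks (path is an assumption)
[Hypothetical Trackpoint Override]
MatchUdevType=pointingstick
AttrTrackpointMultiplier=1.25
```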
|
Posted over 5 years ago
GSoC Final Report
Nothing lasts forever, and this also applies for GSoC projects. In this report,
I tried to summarize my experience in the DRI community and my contributions.
Recap the project idea
First, it is important to remember the main subject of my GSoC Project:
The Kernel Mode-Setting (KMS) is a mechanism that enables a process to
command the kernel to set a mode (screen resolution, color depth, and rate)
which is in a range of values supported by graphics cards and the display
screen. Creating a Virtual KMS (VKMS) has benefits. First, it could be used for
testing; second, it can be valuable for running X or Wayland on a headless
machine enabling the use of GPU. This module is similar to VGEM, and in some
ways to VIRTIO. At the moment that VKMS gets mature enough, it will be used to
run i-g-t test cases and to automate userspace testing.
I heard about VKMS in the DRM TODO list and decided to apply for GSoC with this
project. A very talented developer from Saudi Arabia named Haneen Mohammed had
the same idea but applied to the Outreachy program. We worked together with the
desire to push as hard as we can the Virtual KMS.
Overcome the steep learning curve
In my opinion, the main reason for the steep learning curve came from the lack
of background experience in how the graphics stack works. For example, when I
took operating system classes, I studied many things related to schedulers,
memory and disk management, and so forth; on the other hand, I had a 10000-foot
view of graphics systems. After long hours of studying and coding, I started to
understand better how things work. It is incredible all the progress and
advances that the DRI developers brought in the last few years! I wish that
new versions of operating system books had a whole chapter on this
subject.
I still have problems understanding all the mechanisms available in the DRM;
however, now I feel confident in how to read the code/documentation and get
into the details of the DRM subsystem. I have plans to compile all the
knowledge acquired during the project in a series of blog posts.
Contributions
During my work in the GSoC, I sent my patches to the DRI mailing list and
constantly got feedback to improve my work; as a result, I reworked most of my
patches. The natural and reliable way to track the contributions is by using
git log --author="Rodrigo Siqueira" in one of the repositories below:
For DRM patches: git://anongit.freedesktop.org/drm-misc
For patches already applied to Torvalds branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
For IGT patches: git://anongit.freedesktop.org/drm/igt-gpu-tools
In summary, these are the main patches that I got accepted:
drm/vkms: Fix connector leak at the module removal
drm/vkms: Add framebuffer and plane helpers
drm/vkms: Add vblank events simulated by hrtimers
drm/vkms: Add connectors helpers
drm/vkms: Add dumb operations
drm/vkms: Add extra information about vkms
drm/vkms: Add basic CRTC initialization
drm/vkms: Add mode_config initialization
We received two contributions from external people; I reviewed both patches:
drm/vkms: Use new return type vm_fault_t
drm/vkms: Fix the error handling in vkms_init()
I am using IGT to test VKMS; for this reason, I decided to send some
contributions to them. I sent a series of patches fixing GCC warnings:
Fix comparison that always evaluates to false
Avoid truncate string in __igt_lsof_fds
Remove parameter aliases with another argument
Move declaration to the top of the code
Account for NULL character when using strncpy
Make string commands dynamic allocate (waiting for review)
Fix truncate string in the snprintf (waiting for review)
I also sent a patchset with the goal of adding support for forcing a specific
module to be used by IGT tests:
Add support to force specific module load
Increase the string size for a module name (waiting for review)
Add support for forcing specific module (waiting for review)
As a miscellaneous contribution, I created a series of scripts to automate the
workflow of Linux Kernel development. This small project was based on a series
of scripts provided by my mentor, and I hope it can be useful for newcomers.
The project link follows:
Kworkflow
Work in Progress
I am glad to say that I accomplished all the tasks initially proposed and I did
much more. Now I am working to make VKMS work without vblank. This is still a
work in progress, but I am confident that I can finish it soon. Finally, it is
important to highlight that my GSoC participation will finish at the end of
August because I traveled for two weeks to join the debconf2018.
Now this is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning - Winston Churchill
GSoC gave me one thing that I was pursuing for a long time: a subsystem in the
Linux Kernel that I can focus on for years. I am delighted that I found a
place to focus, and I will keep working on VKMS until it is finished.
Finally, the Brazilian government opened a call for encouraging free software
development, and I decided to apply with the VKMS project. Last week, I received the
great news that I was selected in the first phase and now I am waiting for the
final results. If everything ends well for me, I will receive funding to work
for 5 months in the VKMS and DRM subsystem.
My huge thanks to…
I received support from many people in the dri-devel channel and mailing list.
I want to thank everybody for all the support and patience.
I want to thank Daniel Vetter for all the feedback and assistance in the VKMS
work. I also want to thank Gustavo Padovan for all the support that he
provided to me (which included some calls with great explanations about the
DRM). Finally, I want to thank Haneen for all the help and great work.
Reference
Use new return type vm_fault_t
Fix the error handling in vkms_init()
Add support to force specific module load
I received support from many people in the dri-devel channel and mailing list.
I want to thanks everybody for all the support and patience.
I want to thanks Daniel Vetter for all the feedback and assistance in the VKMS
work. I also want to thanks Gustavo Padovan for all the support that he
provided to me (which include some calls with great explanations about the
DRM). Finally, I want to thanks Haneen for all the help and great work.
Reference
Use new return type vm_fault_t
Fix the error handling in vkms_init()
Add support to force specific module load
[Less]
|
Posted
over 5 years
ago
GSoC Final Report
Nothing lasts forever, and this also applies to GSoC projects. In this report,
I try to summarize my experience in the DRI community and my contributions.
Recap the project idea
First, it is important to remember the main
subject of my GSoC Project:
The Kernel Mode-Setting (KMS) is a mechanism that enables a process to
command the kernel to set a mode (screen resolution, color depth, and refresh rate)
within the range of values supported by the graphics card and the display
screen. Creating a Virtual KMS (VKMS) has benefits. First, it can be used for
testing; second, it can be valuable for running X or Wayland on a headless
machine, enabling the use of the GPU. This module is similar to VGEM, and in some
ways to VIRTIO. Once VKMS gets mature enough, it will be used to
run i-g-t test cases and to automate userspace testing.
I heard about VKMS in the DRM TODO list and decided to apply for GSoC with this
project. A very talented developer from Saudi Arabia named Haneen Mohammed had
the same idea but applied to the Outreachy program. We worked together with the
desire to push the Virtual KMS forward as hard as we could.
Overcoming the steep learning curve
In my opinion, the main reason for the steep learning curve was my lack
of background in how the graphics stack works. For example, when I
took operating system classes, I studied many things related to schedulers,
memory and disk management, and so forth; on the other hand, I had only a 10000-foot
view of graphics systems. After long hours of studying and coding, I started to
understand better how things work. The progress and advances that the DRI
developers have brought in the last few years are incredible! I wish that
new editions of operating systems books had a whole chapter on this
subject.
I still have trouble understanding all the mechanisms available in the DRM;
however, I now feel confident about how to read the code/documentation and get
into the details of the DRM subsystem. I plan to compile all the
knowledge acquired during the project into a series of blog posts.
Contributions
During my GSoC work, I sent my patches to the DRI mailing list and
constantly got feedback to improve my work; as a result, I reworked most of my
patches. The natural and reliable way to track my contributions is to run
git log --author="Rodrigo Siqueira" in one of the repositories below:
For DRM patches: git://anongit.freedesktop.org/drm-misc
For patches already applied to Torvalds branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
For IGT patches: git://anongit.freedesktop.org/drm/igt-gpu-tools
In summary, these are the main patches I got accepted:
drm/vkms: Fix connector leak at the module removal
drm/vkms: Add framebuffer and plane helpers
drm/vkms: Add vblank events simulated by hrtimers
drm/vkms: Add connectors helpers
drm/vkms: Add dumb operations
drm/vkms: Add extra information about vkms
drm/vkms: Add basic CRTC initialization
drm/vkms: Add mode_config initialization
We received two contributions from external developers; I reviewed both patches:
drm/vkms: Use new return type vm_fault_t
drm/vkms: Fix the error handling in vkms_init()
I am using IGT to test VKMS; for this reason, I decided to send some
contributions to it as well. I sent a series of patches fixing GCC warnings:
Fix comparison that always evaluates to false
Avoid truncate string in __igt_lsof_fds
Remove parameter aliases with another argument
Move declaration to the top of the code
Account for NULL character when using strncpy
Make string commands dynamic allocate (waiting for review)
Fix truncate string in the snprintf (waiting for review)
I also sent a patchset aimed at adding support for forcing a specific
module to be used by IGT tests:
Add support to force specific module load
Increase the string size for a module name (waiting for review)
Add support for forcing specific module (waiting for review)
As a miscellaneous contribution, I created a series of scripts to automate the
Linux kernel development workflow. This small project was based on a set
of scripts provided by my mentor, and I hope it can be useful for newcomers.
Here is the project link:
Kworkflow
Work in Progress
I am glad to say that I accomplished all the tasks initially proposed, and I did
much more. Now I am working on making VKMS work without vblank. This is still a work
in progress, but I am confident that I can finish it soon. Finally, it is
important to highlight that my GSoC participation will finish at the end of
August because I traveled for two weeks to attend DebConf 2018.
"Now this is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning." - Winston Churchill
GSoC gave me one thing that I had been pursuing for a long time: a subsystem in the
Linux kernel that I can focus on for years. I am delighted that I found a
place to focus, and I will keep working on VKMS until it is finished.
Finally, the Brazilian government opened a call to encourage free software
development, and I decided to apply with the VKMS project. Last week, I received the
great news that I was selected in the first phase, and now I am waiting for the
final results. If everything ends well for me, I will receive funding to work
for 5 months on VKMS and the DRM subsystem.
My huge thanks to…
I received support from many people in the dri-devel channel and mailing list.
I want to thank everybody for all the support and patience.
I want to thank Daniel Vetter for all the feedback and assistance in the VKMS
work. I also want to thank Gustavo Padovan for all the support that he
provided to me (which included some calls with great explanations about the
DRM). Finally, I want to thank Haneen for all the help and great work.
References
Use new return type vm_fault_t
Fix the error handling in vkms_init()
Add support to force specific module load
|
Posted
over 5 years
ago
libinput made a design decision early on to use physical reference points wherever possible. So your virtual buttons are X mm high/across, the pointer movement is calculated in mm, etc. Unfortunately this exposed us to a large range of devices that
don't bother to provide that information or just give us the wrong information to begin with. Patching the kernel for every device is not feasible, so in 2015 the 60-evdev.hwdb was born and it has seen steady updates since. Plenty a libinput bug was fixed by just correcting the device's axis ranges or resolution. To take the magic out of the 60-evdev.hwdb, here's a blog post for your perusal, appreciation or, failing that, shaking a fist at. Note that the below is caller-agnostic; it doesn't matter what userspace stack you use to process your input events. There are four parts that come together to fix devices: a kernel ioctl and a trifecta of udev rules, hwdb entries and a udev builtin.
The kernel's EVIOCSABS ioctl
It all starts with the kernel's struct input_absinfo:
struct input_absinfo {
    __s32 value;
    __s32 minimum;
    __s32 maximum;
    __s32 fuzz;
    __s32 flat;
    __s32 resolution;
};
The three values that matter right now are minimum, maximum and resolution. The "value" is just the most recent value on this axis; ignore fuzz/flat for now. The min/max values simply specify the range of values the device will give you, the resolution how many values per mm you get. Simple example: an x axis given at min 0, max 1000 at a resolution of 10 means your device is 100mm wide. There is no requirement for min to be 0, btw, and there's no clipping in the kernel, so you may get values outside min/max. Anyway, your average touchpad looks like this in evemu-record:
# Event type 3 (EV_ABS)
#   Event code 0 (ABS_X)
#     Value        2572
#     Min          1024
#     Max          5112
#     Fuzz            0
#     Flat            0
#     Resolution     41
#   Event code 1 (ABS_Y)
#     Value        4697
#     Min          2024
#     Max          4832
#     Fuzz            0
#     Flat            0
#     Resolution     37
This is the information returned by the EVIOCGABS ioctl (EVdev IOCtl Get ABS). It is usually run once on device init by any process handling evdev device nodes. Because plenty of devices don't announce the correct ranges or resolution, the kernel provides the EVIOCSABS ioctl (EVdev IOCtl Set ABS).
This allows overwriting the in-kernel struct with new values for min/max/fuzz/flat/resolution; processes that query the device later will get the updated ranges.
udev rules, hwdb and builtins
The kernel has no notification mechanism for updated axis ranges, so the ioctl must be applied before any process opens the device. This effectively means it must be applied by a udev rule. udev rules are a bit limited in what they can do, so if we need to call an ioctl, we need to run a program. And while udev rules can do matching, the hwdb is easier to edit and maintain. So the pieces we have are: a hwdb that knows when to change (and the values), a udev program to apply the values, and a udev rule to tie those two together. In our case the rule is 60-evdev.rules. It checks the 60-evdev.hwdb for matching entries [1], then invokes the udev-builtin-keyboard if any matching entries are found. That builtin parses the udev properties assigned by the hwdb and converts them into EVIOCSABS ioctl calls. These three pieces need to agree on each other's formats: the udev rule and hwdb agree on the matches, and the hwdb and the builtin agree on the property names and value format. By itself, the hwdb has no specific format beyond this:
some-match-that-identifies-a-device
 PROPERTY_NAME=value
 OTHER_NAME=othervalue
But since we want to match for specific use-cases, our udev rule assembles several specific match lines. Have a look at 60-evdev.rules again: the last rule in there assembles a string in the form of "evdev:name:the device name:content of /sys/class/dmi/id/modalias". So your hwdb entry could look like this:
evdev:name:My Touchpad Name:dmi:*svnDellInc*
 EVDEV_ABS_00=0:1:3
If the name matches and you're on a Dell system, the device gets the EVDEV_ABS_00 property assigned. The "evdev:" prefix in the match line is merely to distinguish it from other match rules to avoid false positives. It can be anything; libinput unsurprisingly used "libinput:" for its properties.
The last part now is understanding what EVDEV_ABS_00 means. It's a fixed string with the axis number as a hex number; 0x00 is ABS_X. The values afterwards are simply min, max, resolution, fuzz, flat, in that order. So the above example would set min/max to 0:1 and resolution to 3 (not very useful, I admit). Trailing bits can be skipped altogether, and bits that don't need overriding can be skipped as well, provided the colons are in place. So the common use-case of overriding a touchpad's x/y resolution looks like this:
evdev:name:My Touchpad Name:dmi:*svnDellInc*
 EVDEV_ABS_00=::30
 EVDEV_ABS_01=::20
 EVDEV_ABS_35=::30
 EVDEV_ABS_36=::20
0x00 and 0x01 are ABS_X and ABS_Y, so we're setting those to 30 units/mm and 20 units/mm, respectively. And if the device is multitouch capable we also need to set ABS_MT_POSITION_X and ABS_MT_POSITION_Y to the same resolution values. The min/max ranges for all axes are left as-is. The most confusing part is usually this: the hwdb uses a binary database that needs updating whenever the hwdb entries change. A call to systemd-hwdb update does that job. So with all the pieces in place, let's see what happens when the kernel tells udev about the device:
The udev rule assembles a match and calls out to the hwdb,
The hwdb applies udev properties where applicable and returns success,
The udev rule calls the udev keyboard-builtin
The keyboard builtin parses the EVDEV_ABS_xx properties and issues an EVIOCSABS ioctl for each axis,
The kernel updates the in-kernel description of the device accordingly
The udev rule finishes and udev sends out the "device added" notification
The userspace process sees the "device added" and opens the device which now has corrected values
Celebratory champagne corks are popping everywhere, hands are shaken, shoulders are patted in congratulations of another device saved from the tyranny of wrong axis ranges/resolutions
Once you understand how the various bits fit together, it should be quite easy to understand what happens. Then the remainder is just adding hwdb entries where necessary, and the touchpad-edge-detector tool is useful for figuring those out.
[1] Not technically correct: the udev rule merely calls the hwdb builtin, which searches through all hwdb entries. It doesn't matter which file the entries are in.
|
Posted
over 5 years
ago
For some time now I have been supporting two Linux developers on Patreon: Ryan Gordon, of Linux game porting and SDL development fame, and Tanu Kaskinen, who is a lead developer on PulseAudio these days.
One of the things I often think about is how
we can enable more people to make a living from working on the Linux desktop and related technologies. If you’re reading my blog there is a good chance that you are already enabling people to make a living working on the Linux desktop by paying for RHEL Workstation subscriptions through your work. So a big thank you for that. The fact that Red Hat has paying customers for our desktop products is critical in terms of our ability to do so much of the maintenance and development work we do around the Linux desktop and Linux graphics stack.
That said, I do feel we need more venues than just employment by companies such as Red Hat, and this is where I would love to see more people supporting their favourite projects and developers through, for instance, Patreon. Because unlike one-off funding campaigns, recurring crowdfunding like Patreon can give developers a predictable income, which means they don’t have to worry about how to pay their rent or how to feed their kids.
So of the two Patreons I support, Ryan is probably the closest to being able to rely on it for his livelihood, but of course more Patreon supporters would enable Ryan to be even less reliant on payments from game makers. And Tanu’s Patreon income at the moment is helping him spend quite a bit of time on PulseAudio, but it is definitely not providing him with a living income. So if you are reading this, I strongly recommend that you support Ryan Gordon and Tanu Kaskinen on Patreon. You don’t need to pledge a lot; I think it is in fact better to have many people pledging 10 dollars a month than a few pledging hundreds, because the impact of one person coming or going is then a lot less. And of course this is not just limited to Ryan and Tanu: search around and see if any projects or developers you personally care deeply about are using crowdfunding, and support them, because if more of us did so then more people would be able to make a living developing our favourite open source software.
Update: It seems I wasn’t the only one thinking about this; Flatpak announced today that application developers can put their crowdfunding information into their flatpaks and it will be advertised in GNOME Software.
|