
News

Posted 3 days ago
As mentioned in my previous blog, the X.Org Foundation now wants us to blog every week. Whilst that means shorter blogs (last week's was a tad long), it also means that there isn't much to blog about if I didn't do much in a week. Such is the case for this week, sadly. It's the last week of university, and so there were a few assignment deadlines I needed to meet; I haven't been able to invest as much time into my project as I would have wanted.
Posted 3 days ago
In this week's article for my ongoing Google Summer of Code (GSoC) project I planned on writing about the basic idea behind the project, but I reconsidered and decided to first give an overview of how Xwayland functions on a high level, and next week take a look at its inner workings in detail. The reason is that there is not much Xwayland documentation available right now, so these two articles are meant to fill this void and give interested beginners a helping hand. In two weeks I'll catch up on explaining the project's idea.

Since we are staying high-level this week, the first question is: what is Xwayland supposed to achieve at all? You may know this already. It's the component in a Wayland session that ensures applications which don't support Wayland, but only the old Xserver, still function normally; in other words, it ensures backwards compatibility. But how does it do this? Before we go into that, there is one more thing to talk about, since I only called Xwayland "something" above. What is Xwayland exactly? How does it look on your Linux system? We'll see next week that it's not as easy to answer as the following simple explanation makes it appear, but for now this is enough: it's a single binary containing an Xserver with a special backend written to communicate with the Wayland compositor active on your system - for example with KWin in a Plasma Wayland session.

To make it more tangible let's take a look at Debian: there is a package called Xwayland and it consists of basically only the aforementioned binary file. This binary gets copied to /usr/bin/Xwayland. Compare this to the normal Xserver provided by X.org, which in Debian you can find in the package xserver-xorg-core. The respective binary gets put into /usr/bin/Xorg together with a symlink /usr/bin/X pointing to it. While the latter is the central building block in an X session and therefore gets launched before anything else with graphical output, the Xserver in the Xwayland binary works differently: it is embedded in a Wayland session. And in a Wayland session the Wayland compositor is the central building block. This means in particular that the Wayland compositor also takes up the role of the server, which talks to Wayland native applications with graphical output as its clients. They send requests to it in order to present their painted stuff on the screen.

The Xserver in the Xwayland binary is only a necessary link between applications that are only able to speak to an Xserver and the Wayland compositor/server. Therefore the Xwayland binary gets launched later on by the compositor or some other process in the workspace. In Plasma it's launched by KWin after the compositor has initialized the rendering pipeline. You find the relevant code here. Although in this case KWin also establishes some communication channels with the newly created Xwayland process, in general the communication between Xwayland and a Wayland server is done via the normal Wayland protocol, in the same way other native Wayland applications talk to the compositor/server. This means the windows requested by possibly several X based applications, and provided by Xwayland acting as an Xserver, are translated by Xwayland into Wayland compatible objects and, with Xwayland acting as a native Wayland client, sent to the Wayland compositor via the Wayland protocol. These windows look to the Wayland compositor just like the windows - in Wayland terminology, surfaces - of every other Wayland native application.

When reading this keep in mind that an application in Wayland is not limited to using only one window/surface but can create multiple at the same time, so Xwayland as a native Wayland client can do the same for all the windows created for all of its X clients. In the second part next week we'll have a close look at the Xwayland code to see how Xwayland fills its role as an Xserver with regard to its X based clients and at the same time acts as a Wayland client when facing the Wayland compositor.
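If you want to see this arrangement on your own machine, a small script like the following can help. It is only a rough illustration using generic Linux mechanisms (environment variables and /proc), nothing specific to KWin or Xwayland is assumed: in a Plasma Wayland session you would typically see both WAYLAND_DISPLAY and DISPLAY set, plus a running Xwayland process.

import os

def session_overview():
    wayland = os.environ.get("WAYLAND_DISPLAY")   # set in a Wayland session, e.g. "wayland-0"
    x11 = os.environ.get("DISPLAY")               # set when an X server (including Xwayland) is reachable, e.g. ":0"
    print(f"WAYLAND_DISPLAY = {wayland!r}")
    print(f"DISPLAY         = {x11!r}")

    # Look for a running Xwayland process; in a Plasma Wayland session it is
    # started by the compositor, as described above.
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/comm") as f:
                if f.read().strip() == "Xwayland":
                    print(f"Xwayland is running as PID {pid}")
        except OSError:
            pass  # process vanished or not readable

if __name__ == "__main__":
    session_overview()

If both environment variables are set and an Xwayland process shows up, you are looking at exactly the setup described above: a Wayland compositor in charge, with Xwayland bridging the remaining X clients into it.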
Posted 6 days ago
Felt it had been too long since I did another Fedora Workstation update. We spend a lot of time trying to figure out how we can best spend our resources to produce the best desktop possible for our users, because even though Red Hat invests more into the Linux desktop than any other company by quite a margin, our resources are still far from limitless. So we continuously ask ourselves whether the areas we are investing in are the right ones, the ones that give our users the things they need the most. Below is a sampling of the things we are working on.

Improving integration of the NVidia binary driver
This has been ongoing for quite a while, but things have started to land now. Hans de Goede and Simone Caronni have been collaborating, building on the work by NVidia and Adam Jackson around glvnd. So if you set up Simone's NVidia repository hosted on negativo17 you will be able to install the NVidia driver without any conflicts with the Mesa stack, and thanks to Hans' work you can be fairly sure that even if the NVidia driver stops working with a given kernel update you will smoothly transition back to the open source Nouveau driver. I have been testing it on my own Lenovo P70 system for the last week and it seems to work well under X. That said, once you install the binary NVidia driver that is what you're running on, which is of course not the behaviour you want from a hybrid graphics system. Fixing that last issue requires further collaboration between us and NVidia.

Related to this, Adam Jackson is currently working on a project he calls glxmux. glxmux will allow you to have more than one GLX implementation on the system, so that you can switch between Mesa GLX for the Intel integrated graphics card and NVidia GLX for the binary driver. While we can make no promises, we hope to have the framework in place for Fedora Workstation 27. Having that in place should allow us to create a solution where you only use the NVidia driver when you want the extra graphics power. That will of course require significant work from NVidia to enable it on their side, so I can't give a definite timeline for when all the puzzle pieces are in place. Just be assured we are working on it and talking regularly to NVidia about it. I will let you know here as soon as things come together. On the Wayland side, Jonas Ådahl is working on putting the final touches on hybrid graphics support.

Fleet Commander ready for take-off
Another major project we have been working on for a long time is Fleet Commander. Fleet Commander is a tool that allows you to manage Fedora and RHEL desktops centrally. It is targeted at, for instance, universities or companies with tens, hundreds or thousands of workstation installations. It gives you a graphical, browser-based UI (accessible through Cockpit) to create configuration profiles and deploy them across your organization. Currently it allows you to control anything that has a GSetting associated with it, like enabling/disabling extensions and changing settings in GTK+ and GNOME applications. It allows you to configure NetworkManager settings, so if you're updating the company VPN or proxy settings you can easily push those changes out to all users in the organization, or quickly migrate Evolution email settings to a new email server. The tool also allows you to control recommended applications in the Software Center and set bookmarks in Firefox. There is also support for controlling settings inside LibreOffice.

All these features can be set and controlled at either the user level, the group level or organization-wide, thanks to the close integration we have with the FreeIPA suite of tools. The data is stored inside your organization's LDAP server alongside other user information, so you don't need to have the clients connect to a new service for this, and while it is not there in this initial release we will in the future also support Active Directory. The initial release and the Fleet Commander website will be out alongside Fedora Workstation 26.

PipeWire
I talked about PipeWire before, when it was still called Pinos, but the scope and ambition of the project have changed significantly since then. Last time I spoke about it, the goal was just to create something that could be considered a video equivalent of PulseAudio. Wim Taymans, who you might know co-created GStreamer and who has been a major PulseAudio contributor, has since expanded the scope, and PipeWire now aims at unifying Linux audio and video. The long-term goal is for PipeWire to not only handle video streams, but also all kinds of audio. Due to this, Wim has been spending a lot of time making sure PipeWire can handle audio in a way that not only addresses the PulseAudio use cases, but also the ones handled by JACK today. A big part of the motivation for this is that we want to make Fedora Workstation the best place to create content and we want the pro-audio crowd to be first class citizens of our desktop. At the same time we don't want to make this another painful subsystem transition, so we will need to ensure that PulseAudio applications can still run without modification.

We expect to start shipping PipeWire with Fedora Workstation 27, but at that point it will only handle video, as we need this both to enable good video handling for Flatpak applications through a video portal and to provide an API for applications that want to do screen capture under Wayland, like web browsers offering screen sharing. We will then bring the audio features on board in subsequent releases, as we also try to work with the JACK and PulseAudio communities to make this a joint effort. We are also working now on a proper website for PipeWire.

Red Hat developer integration
A feature we are quite excited about is the integration of support for the Red Hat developer account system into Fedora. This means that you should be able to create a Red Hat developer account through GNOME Online Accounts, and once you have that account set up you should be able to easily create Red Hat Enterprise Linux virtual machines or containers on your Fedora system. This is a crucial piece for the developer focus that we want the workstation to have, and one that we think will make a lot of developers' lives easier. We were originally hoping to have this ready for Fedora Workstation 26, but at the moment it looks more likely to hit Fedora Workstation 27; we will keep you up to date as this progresses.

Fractional scaling for HiDPI systems
Fedora Workstation has been leading the charge in supporting HiDPI on Linux and we hope to build on that with the current work to enable fractional scaling support. Since we introduced HiDPI support we have been improving it step by step; for instance, last year we introduced support for dealing with different DPI levels per monitor for Wayland applications. The fractional scaling work will take this a step further. The biggest problem it will resolve is that for certain monitor sizes the current scaling options leave things either too small or too big. With fractional scaling support we will introduce intermediate steps, so that you can scale your interface by 1.5x instead of having to go all the way to 2x. The set of technologies we are developing for handling fractional scaling should also allow us to provide better scaling for XWayland applications, as it provides us with methods for scaling that don't need direct support from the windowing system or toolkit.

GNOME Shell performance
Carlos Garnacho has been doing some great work recently improving the general performance of GNOME Shell. This comes on top of his earlier performance work that was very well received. How fast or slow GNOME Shell feels is often a subjective thing, but reducing overhead where we can is never a bad thing.

Flatpak building
Owen Taylor has been working hard on putting the pieces in place to start large-scale Flatpak building in Fedora. You might see a couple of test Flatpaks appear in the Fedora Workstation 26 timeframe, but the goal is to have a huge Flatpak catalog ready in time for Fedora Workstation 27. Essentially what we are doing is making it very simple for a Fedora maintainer to build a Flatpak of the application they maintain through the Fedora package building infrastructure and push that Flatpak into a central Flatpak registry. And while this is mainly meant to benefit Fedora users, there is of course nothing stopping other distributions from offering these Flatpak-packaged applications to their users as well.

Atomic Workstation
Another effort that is marching forward is what we call Atomic Workstation. The idea here is to have an immutable OS image, kind of like what you see on, for instance, Android devices. The advantage of this is that the core of the operating system gets tested and deployed as a unit, and the chance of users ending up with broken systems decreases significantly, as we don't need to rely on packages getting applied in the correct order or scripts executing as expected on each individual workstation out there. This effort is largely based on the Project Atomic effort, and the end goal is to have an image-based OS install and Flatpak-based applications on top of it. If you are very adventurous and/or want to help out with this effort you can get the ISO image installer for Atomic Workstation here.

Firmware handling
Our Linux Firmware project is still going strong, with new features being added and new vendors signing on. As Richard Hughes recently blogged about, the latest vendor joining the effort is Logitech, who will now upload their firmware into the service so that you can keep your Logitech peripherals updated through it. It is worthwhile pointing out how we worked with Logitech to make this happen, with Richard working on the special tooling needed and thus reducing the threshold for Logitech to start offering their firmware through the service. We are having similar discussions and collaborations with other vendors, so expect to see more. At this point I tend to recommend people get a Dell to run Linux, due to their strong support for efforts such as the Linux Firmware Service, but other major vendors are in the final stages of testing, so expect more major vendors to start pushing firmware updates soon.

High Dynamic Range
The next big thing in the display technology field is HDR (High Dynamic Range). HDR allows for deeper, more vibrant colours; it is a feature seen on a lot of new TVs these days, and game consoles like the PlayStation 4 support it. Computer monitors with this feature are now appearing on the market too, for instance the Dell UP2718Q. We want to ensure Fedora and Linux are leaders here, for the benefit of video and graphics artists using Fedora and Red Hat Enterprise Linux. We are thus kicking off an effort to make sure this technology matures as quickly as possible and is fully supported. We are not the only ones interested in this, so we will hopefully be collaborating with our friends at Intel, AMD and NVidia on it. We hope to have the first monitors delivered to our office within a few weeks.

Codecs
While playback these days has moved to streaming, where locally installed codecs are of less importance for the consumption use case, having a wide selection of codecs available is still important for media editing and creation use cases, so we want you to be able to load a variety of old media files into your video editor, for instance. Luckily we are at a crossroads now where a lot of widely used codecs have their essential patents expiring (MP3, AC3 and more), while at the same time the industry focus seems to have moved to royalty-free codec development (Opus, VP9, Alliance for Open Media). We have been spending a lot of time with the Red Hat legal team trying to clear these codecs, which resulted in MP3 and AC3 now shipping in Fedora Workstation. We have more codecs on the way though, so this effort is in no way over. My goal is that over the course of this year the situation of software patents being a huge issue when dealing with audio and video codecs on Linux will be considered a thing of the past. I would like to thank the Red Hat legal team for their support on this issue, as they have had to spend significant time on it; a big company like Red Hat does need to do its own due diligence when it comes to these things, and we can't just trust statements from random people on the internet that these codecs are now free to ship.

Battery life
We have been looking at this for a while now and hope to be able to start sharing information with users on which laptops to get for good battery life under Fedora. Christian Kellner is now our point man on battery life, and he has taken up improving the Battery Bench tool that Owen Taylor wrote some time ago.

QtGNOME platform
We will have a new version of the QtGNOME platform in Fedora 26. For those of you who have not yet heard of this effort, it is a set of themes and tools to ensure that Qt applications run without any major issues under GNOME 3. With the new version the theming expands to include the accessibility and dark themes in Adwaita, meaning that if you switch to one of these themes under GNOME Shell it will also switch your Qt applications over. We are also making sure things like cut and paste and drag and drop work well. The version in Fedora Workstation 26 is a big step forward for this effort and should hopefully make Qt applications first class citizens on your Fedora Workstation desktop.

Wayland polish
Ever since we switched the default to Wayland we have kept the pressure up and kept fixing bugs and finding solutions for corner cases. The result should be an improved Wayland experience in Fedora Workstation 26. A big thanks to Olivier Fourdan, Jonas Ådahl and the whole Wayland community for their continued efforts here. Two major items Jonas is working on, for instance, are improving fractional scaling, to ensure that your desktop scales to an optimal size on HiDPI displays of various sizes, and preparing an API that will allow screen sharing under Wayland, so that, for instance, sharing your slides over video conferencing can work under Wayland. What we currently have for scaling is limited to 1x or 2x, which is either too small or too big for some screens, but with this work you can also do 1.5x scaling.
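To put rough numbers on the scaling problem described above: the figures below are just an illustrative calculation for a hypothetical 4K panel, not measurements from GNOME or mutter.

# Illustrative arithmetic only: the logical desktop area a 3840x2160 panel
# yields at integer and fractional scale factors (the panel size is an example).
PANEL_W, PANEL_H = 3840, 2160

for scale in (1.0, 1.5, 2.0):
    logical_w, logical_h = round(PANEL_W / scale), round(PANEL_H / scale)
    print(f"scale {scale}: logical desktop of {logical_w}x{logical_h}")

# scale 1.0: logical desktop of 3840x2160  (everything tiny)
# scale 1.5: logical desktop of 2560x1440  (the intermediate step this work enables)
# scale 2.0: logical desktop of 1920x1080  (often too little space on larger 4K screens)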
Posted 7 days ago
Introducing casync

In the past months I have been working on a new project: casync. casync takes inspiration from the popular rsync file synchronization tool as well as the probably even more popular git revision control system. It combines the idea of the rsync algorithm with the idea of git-style content-addressable file systems, and creates a new system for efficiently storing and delivering file system images, optimized for high-frequency update cycles over the Internet. Its current focus is on delivering IoT, container, VM, application, portable service or OS images, but I hope to extend it later in a generic fashion to become useful for backups and home directory synchronization as well (but more about that later). The basic technological building blocks casync is built from are neither new nor particularly innovative (at least not anymore), however the way casync combines them is different from existing tools, and that's what makes it useful for a variety of use-cases that other tools can't cover that well.

Why?

I created casync after studying how today's popular tools store and deliver file system images. To briefly name a few: Docker has a layered tarball approach, OSTree serves the individual files directly via HTTP and maintains packed deltas to speed up updates, while other systems operate on the block layer and place raw squashfs images (or other archival file systems, such as ISO9660) for download on HTTP shares (in the better cases combined with zsync data). None of these approaches appeared fully convincing to me when used in high-frequency update cycle systems. In such systems, it is important to optimize towards a couple of goals:

- Most importantly, make updates cheap traffic-wise (for this most tools use image deltas of some form)
- Put boundaries on disk space usage on servers (keeping deltas between all version combinations clients might want to run updates between would suggest keeping an exponentially growing amount of deltas on servers)
- Put boundaries on disk space usage on clients
- Be friendly to Content Delivery Networks (CDNs), i.e. serve neither too many small nor too many overly large files, and only require the most basic form of HTTP
- Provide the repository administrator with high-level knobs to tune the average file size delivered
- Simplicity of use for users, repository administrators and developers

I don't think any of the tools mentioned above are really good on more than a small subset of these points. Specifically:

- Docker's layered tarball approach dumps the "delta" question onto the feet of the image creators: the best way to make your image downloads minimal is basing your work on an existing image clients might already have, and inheriting its resources, maintaining full history. Here, revision control (a tool for the developer) is intermingled with update management (a concept for optimizing production delivery). As container histories grow, individual deltas are likely to stay small, but on the other hand a brand-new deployment usually requires downloading the full history onto the deployment system, even though there's no use for it there, and likely requires substantially more disk space and download sizes.
- OSTree's serving of individual files is unfriendly to CDNs (as many small files in file trees cause an explosion of HTTP GET requests). To counter that, OSTree supports placing pre-calculated delta images between selected revisions on the delivery servers, which means a certain amount of revision management that leaks into the clients.
- Delivering direct squashfs (or other file system) images is almost beautifully simple, but of course means every update requires a full download of the newest image, which is bad both for disk usage and for generated traffic. Enhancing it with zsync makes this a much better option, as it can reduce generated traffic substantially at very little cost in history/meta-data (no explicit deltas between a large number of versions need to be prepared server side). On the other hand, server requirements in disk space and functionality (HTTP Range requests) are minus points for the use-case I am interested in.

(Note: all the mentioned systems have great properties, and it's not my intention to badmouth them. The only point I am trying to make is that for the use case I care about — file system image delivery with high-frequency update cycles — each system comes with certain drawbacks.)

Security & Reproducibility

Besides the issues pointed out above I wasn't happy with the security and reproducibility properties of these systems. In today's world where security breaches involving hacking and breaking into connected systems happen every day, an image delivery system that cannot make strong guarantees regarding data integrity is out of date. Specifically, the tarball format is famously nondeterministic: the very same file tree can result in any number of different valid serializations depending on the tool used, its version and the underlying OS and file system. Some tar implementations attempt to correct that by guaranteeing that each file tree maps to exactly one valid serialization, but such a property is always only specific to the tool used. I strongly believe that any good update system must guarantee on every single link of the chain that there's only one valid representation of the data to deliver, that can easily be verified.

What casync Is

So much for the background on why I created casync. Now, let's have a look at what casync actually is like, and what it does. Here's the brief technical overview:

Encoding: Let's take a large linear data stream, split it into variable-sized chunks (the size of each being a function of the chunk's contents), and store these chunks in individual, compressed files in some directory, each file named after a strong hash value of its contents, so that the hash value may be used as a key for retrieving the full chunk data. Let's call this directory a "chunk store". At the same time, generate a "chunk index" file that lists these chunk hash values plus their respective chunk sizes in a simple linear array. The chunking algorithm is supposed to create variable, but similarly sized chunks from the data stream, and do so in a way that the same data results in the same chunks even if placed at varying offsets. For more information see this blog story.

Decoding: Let's take the chunk index file, and reassemble the large linear data stream by concatenating the uncompressed chunks retrieved from the chunk store, keyed by the listed chunk hash values.

As an extra twist, we introduce a well-defined, reproducible, random-access serialization format for file trees (think: a more modern tar), to permit efficient, stable storage of complete file trees in the system, simply by serializing them and then passing them into the encoding step explained above.

Finally, let's put all this on the network: for each image you want to deliver, generate a chunk index file and place it on an HTTP server. Do the same with the chunk store, and share it between the various index files you intend to deliver.
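To make the encoding and decoding steps above a bit more concrete, here is a heavily simplified sketch in Python. It is not casync's implementation or on-disk format: the windowed rolling hash is a toy stand-in for buzhash, the cut condition and sizes are arbitrary, per-chunk compression is left out, and the example.castr directory name is only borrowed from the suffix for illustration.

import hashlib, os

def chunk(data: bytes, window: int = 48, mask: int = (1 << 16) - 1):
    # Content-defined chunking with a naive windowed rolling hash.
    # casync uses buzhash; this toy polynomial hash only demonstrates the
    # principle: the cut decision depends on a small window of recent bytes,
    # so the same data produces the same cuts even when it is shifted by an
    # insertion earlier in the stream.
    B = 257
    Bw = pow(B, window, 1 << 32)           # B**window, for dropping the oldest byte
    h, start = 0, 0
    for i, byte in enumerate(data):
        h = (h * B + byte) & 0xFFFFFFFF
        if i >= window:
            h = (h - data[i - window] * Bw) & 0xFFFFFFFF
        if i - start + 1 >= window and (h & mask) == mask:
            yield data[start:i + 1]        # cut point found
            start = i + 1
    if start < len(data):
        yield data[start:]                 # trailing chunk

def encode(data: bytes, store: str):
    # Store each chunk under its SHA256 digest, return the ordered chunk index.
    os.makedirs(store, exist_ok=True)
    index = []
    for c in chunk(data):
        digest = hashlib.sha256(c).hexdigest()
        with open(os.path.join(store, digest), "wb") as f:
            f.write(c)                     # casync would also compress the chunk (xz)
        index.append((digest, len(c)))     # chunk index: hash plus size, in order
    return index

def decode(index, store: str) -> bytes:
    # Reassemble the original stream by concatenating the chunks in index order.
    parts = []
    for digest, _size in index:
        with open(os.path.join(store, digest), "rb") as f:
            parts.append(f.read())
    return b"".join(parts)

# Two similar streams end up sharing most chunk files in the store:
store = "example.castr"
a = os.urandom(1 << 20)
b = a[:1000] + b"some inserted bytes" + a[1000:]
ia, ib = encode(a, store), encode(b, store)
assert decode(ia, store) == a and decode(ib, store) == b
shared = len({d for d, _ in ia} & {d for d, _ in ib})
print(f"{shared} of {len(ib)} chunks of the modified stream were already present")

The important property is visible at the end: because cut decisions depend only on a small window of recent bytes, the second, slightly modified stream reuses almost all of the chunks already sitting in the store.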
Why bother with all of this? Streams with similar contents will result in mostly the same chunk files in the chunk store. This means it is very efficient to store many related versions of a data stream in the same chunk store, thus minimizing disk usage. Moreover, when transferring linear data streams, chunks already known on the receiving side can be made use of, thus minimizing network traffic.

Why is this different from rsync or OSTree, or similar tools? Well, one major difference between casync and those tools is that we remove file boundaries before chunking things up. This means that small files are lumped together with their siblings and large files are chopped into pieces, which permits us to recognize similarities in files and directories beyond file boundaries, and makes sure our chunk sizes are pretty evenly distributed, without the file boundaries affecting them.

The "chunking" algorithm is based on the buzhash rolling hash function. SHA256 is used as the strong hash function to generate digests of the chunks. xz is used to compress the individual chunks.

Here's a diagram, hopefully explaining a bit how the encoding process works, were it not for my crappy drawing skills: the diagram shows the encoding process from top to bottom. It starts with a block device or a file tree, which is then serialized and chunked up into variable sized blocks. The compressed chunks are then placed in the chunk store, while a chunk index file is written listing the chunk hashes in order. (The original SVG of this graphic may be found here.)

Details

Note that casync operates on two different layers, depending on the use-case of the user:

- You may use it on the block layer. In this case the raw block data on disk is taken as-is, read directly from the block device, split into chunks as described above, compressed, stored and delivered.
- You may use it on the file system layer. In this case, the file tree serialization format mentioned above comes into play: the file tree is serialized depth-first (much like tar would do it) and then split into chunks, compressed, stored and delivered.

The fact that it may be used on both the block and file system layer opens it up for a variety of different use-cases. In the VM and IoT ecosystems shipping images as block-level serializations is more common, while in the container and application world file-system-level serializations are more typically used.

Chunk index files referring to block-layer serializations carry the .caibx suffix, while chunk index files referring to file system serializations carry the .caidx suffix. Note that you may also use casync as a direct tar replacement, i.e. without the chunking, just generating the plain linear file tree serialization. Such files carry the .catar suffix. Internally, .caibx files are identical to .caidx files; the only difference is semantic: .caidx files describe a .catar file, while .caibx files may describe any other blob. Finally, chunk stores are directories carrying the .castr suffix.

Features

Here are a couple of other features casync has:

When downloading a new image you may use casync's --seed= feature: each block device, file, or directory specified is processed using the same chunking logic described above, and is used as a preferred source when putting together the downloaded image locally, avoiding network transfer of it. This of course is useful whenever updating an image: simply specify one or more old versions as seed and only download the chunks that truly changed since then. Note that using seeds requires no history relationship between the seed and the new image to download. This has major benefits: you can even use it to speed up downloads of relatively foreign and unrelated data. For example, when downloading a container image built using Ubuntu you can use your Fedora host OS tree in /usr as seed, and casync will automatically use whatever it can from that tree, for example timezone and locale data that tends to be identical between distributions. Example: casync extract http://example.com/myimage.caibx --seed=/dev/sda1 /dev/sda2. This will place the block-layer image described by the indicated URL in the /dev/sda2 partition, using the existing /dev/sda1 data as seeding source. An invocation like this could typically be used by IoT systems with an A/B partition setup. Example 2: casync extract http://example.com/mycontainer-v3.caidx --seed=/srv/container-v1 --seed=/srv/container-v2 /src/container-v3 is very similar but operates on the file system layer, and uses two old container versions to seed the new version.

When operating on the file system level, the user has fine-grained control over the meta-data included in the serialization. This is relevant since different use-cases tend to require a different set of saved/restored meta-data. For example, when shipping OS images, file access bits/ACLs and ownership matter, while file modification times hurt. When doing personal backups, on the other hand, file ownership matters little but file modification times are important. Moreover, different backing file systems support different feature sets, and storing more information than necessary might make it impossible to validate a tree against an image if the meta-data cannot be replayed in full. Due to this, casync provides a set of --with= and --without= parameters that allow fine-grained control of the data stored in the file tree serialization, including the granularity of modification times and more. The precise set of selected meta-data features is also always part of the serialization, so that seeding can work correctly and automatically.

casync tries to be as accurate as possible when storing file system meta-data. This means that besides the usual baseline of file meta-data (file ownership and access bits) and more advanced features (extended attributes, ACLs, file capabilities), a number of more exotic items are stored as well, including Linux chattr(1) file attributes as well as FAT file attributes (you may wonder why the latter? — EFI is FAT, and /efi is part of the comprehensive serialization of any host). In the future I intend to extend this further, for example storing btrfs sub-volume information where available. Note that, as described above, every single type of meta-data may be turned off and on individually, hence if you don't need FAT file bits (and I figure it's pretty likely you don't), then they won't be stored.

The user creating .caidx or .caibx files may control the desired average chunk length (before compression) freely, using the --chunk-size= parameter. Smaller chunks increase the number of generated files in the chunk store and increase HTTP GET load on the server, but also ensure that sharing between similar images is improved, as identical patterns in the images stored are more likely to be recognized. By default casync will use a 64K average chunk size. Tweaking this can be particularly useful when adapting the system to specific CDNs, or when delivering compressed disk images such as squashfs (see below).
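As rough, back-of-the-envelope arithmetic only (the image size is made up, and real chunk counts depend on the data), here is what the --chunk-size= trade-off looks like for a hypothetical 2 GiB image:

# Rough arithmetic only: how the average chunk size affects the number of
# objects a CDN has to serve for a hypothetical 2 GiB image.
IMAGE_SIZE = 2 * 1024**3            # 2 GiB, an arbitrary example image

for avg_chunk in (16 * 1024, 64 * 1024, 256 * 1024):
    n_chunks = IMAGE_SIZE // avg_chunk
    print(f"average chunk {avg_chunk // 1024:>3} KiB -> roughly {n_chunks:,} chunk files")

# average chunk  16 KiB -> roughly 131,072 chunk files  (better sharing, more GET requests)
# average chunk  64 KiB -> roughly 32,768 chunk files   (casync's default)
# average chunk 256 KiB -> roughly 8,192 chunk files    (fewer requests, coarser sharing)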
Emphasis is placed on making all invocations reproducible, well-defined and strictly deterministic. As mentioned above, this is a requirement for reaching the intended security guarantees, but is also useful for many other use-cases. For example, the casync digest command may be used to calculate a hash value identifying a specific directory in all desired detail (use --with= and --without= to pick the desired detail). Moreover, the casync mtree command may be used to generate a BSD mtree(5) compatible manifest of a directory tree, .caidx or .catar file.

The file system serialization format is nicely composable. By this I mean that the serialization of a file tree is the concatenation of the serializations of all files and file sub-trees located at the top of the tree, with zero meta-data references from any of these serializations into the others. This property is essential to ensure maximum reuse of chunks when similar trees are serialized.

When extracting file trees or disk image files, casync will automatically create reflinks from any specified seeds if the underlying file system supports it (such as btrfs, ocfs, and future xfs). After all, instead of copying the desired data from the seed, we can just tell the file system to link up the relevant blocks. This works both when extracting .caidx and .caibx files — the latter of course only when the extracted disk image is placed in a regular raw image file on disk, rather than directly on a plain block device, as plain block devices do not know the concept of reflinks.

Optionally, when extracting file trees, casync can create traditional UNIX hard-links for identical files in specified seeds (--hardlink=yes). This works on all UNIX file systems and can save substantial amounts of disk space. However, it only works for very specific use-cases where disk images are considered read-only after extraction, as any changes made to one tree will propagate to all other trees sharing the same hard-linked files; that's the nature of hard-links. In this mode, casync exposes OSTree-like behavior, which is built heavily around read-only hard-link trees.

casync tries to be smart when choosing what to include in file system images. Implicitly, file systems such as procfs and sysfs are excluded from serialization, as they expose API objects, not real files. Moreover, the "nodump" (+d) chattr(1) flag is honored by default, permitting users to mark files to exclude from serialization.

When creating and extracting file trees casync may apply an automatic or explicit UID/GID shift. This is particularly useful when transferring container images for use with Linux user namespacing.

In addition to local operation, casync currently supports HTTP, HTTPS, FTP and ssh natively for downloading chunk index files and chunks (the ssh mode requires installing casync on the remote host, though, but an sftp mode not requiring that should be easy to add). When creating index files or chunks, only ssh is supported as a remote back-end.

When operating on block-layer images, you may expose locally or remotely stored images as local block devices. Example: casync mkdev http://example.com/myimage.caibx exposes the disk image described by the indicated URL as a local block device in /dev, on which you may then use the usual block device tools, such as mount or fdisk (read-only, though). Chunks are downloaded on access with high priority, and at low priority when idle in the background. Note that in this mode casync also plays a role similar to "dm-verity", as all blocks are validated against the strong digests in the chunk index file before passing them on to the kernel's block layer. This feature is implemented through Linux' NBD kernel facility.

Similarly, when operating on file-system-layer images, you may mount locally or remotely stored images as regular file systems. Example: casync mount http://example.com/mytree.caidx /srv/mytree mounts the file tree image described by the indicated URL as a local directory /srv/mytree. This feature is implemented through Linux' FUSE kernel facility. Note that special care is taken that the images exposed this way can be packed up again with casync make and are guaranteed to return the bit-by-bit exact same serialization they were mounted from. No data is lost or changed while passing things through FUSE (OK, strictly speaking this is a lie, we do lose ACLs, but that's hopefully just a temporary gap to be fixed soon).

In IoT A/B fixed-size partition setups the file systems placed in the two partitions are usually much shorter than the partition size, in order to keep some room for later, larger updates. casync is able to analyze the super-block of a number of common file systems in order to determine the actual size of a file system stored on a block device, so that writing a file system to such a partition and reading it back again will result in reproducible data. Moreover, this speeds up the seeding process, as there's little point in seeding the white-space after the file system within the partition.

Example Command Lines

Here's how to use casync, explained with a few examples:

$ casync make foobar.caidx /some/directory

This will create a chunk index file foobar.caidx in the local directory, and populate the chunk store directory default.castr located next to it with the chunks of the serialization (you can change the name of the store directory with --store= if you like). This command operates on the file-system level. A similar command operating on the block level:

$ casync make foobar.caibx /dev/sda1

This command creates a chunk index file foobar.caibx in the local directory describing the current contents of the /dev/sda1 block device, and populates default.castr in the same way as above. Note that you may as well read a raw disk image from a file instead of a block device:

$ casync make foobar.caibx myimage.raw

To reconstruct the original file tree from the .caidx file and the chunk store of the first command, use:

$ casync extract foobar.caidx /some/other/directory

And similarly for the block-layer version:

$ casync extract foobar.caibx /dev/sdb1

or, to extract the block-layer version into a raw disk image:

$ casync extract foobar.caibx myotherimage.raw

The above are the most basic commands, operating on local data only. Now let's make this more interesting, and reference remote resources:

$ casync extract http://example.com/images/foobar.caidx /some/other/directory

This extracts the specified .caidx onto a local directory. This of course assumes that foobar.caidx was uploaded to the HTTP server in the first place, along with the chunk store. You can use any command you like to accomplish that, for example scp or rsync. Alternatively, you can let casync do this directly when generating the chunk index:

$ casync make ssh.example.com:images/foobar.caidx /some/directory

This will use ssh to connect to the ssh.example.com server, and then place the .caidx file and the chunks on it. Note that this mode of operation is "smart": this scheme will only upload chunks currently missing on the server side, and not re-transmit what is already available. Note that you can always configure the precise path or URL of the chunk store via the --store= option. If you do not do that, then the store path is automatically derived from the path or URL: the last component of the path or URL is replaced by default.castr.

Of course, when extracting .caidx or .caibx files from remote sources, using a local seed is advisable:

$ casync extract http://example.com/images/foobar.caidx --seed=/some/existing/directory /some/other/directory

Or on the block layer:

$ casync extract http://example.com/images/foobar.caibx --seed=/dev/sda1 /dev/sdb2

When creating chunk indexes on the file system layer casync will by default store meta-data as accurately as possible. Let's create a chunk index with reduced meta-data:

$ casync make foobar.caidx --with=sec-time --with=symlinks --with=read-only /some/dir

This command will create a chunk index for a file tree serialization that has three features above the absolute baseline supported: 1s granularity time-stamps, symbolic links and a single read-only bit. In this mode, all the other meta-data bits are not stored, including nanosecond time-stamps, full UNIX permission bits, file ownership or even ACLs or extended attributes.

Now let's make a .caidx file available locally as a mounted file system, without extracting it:

$ casync mount http://example.com/images/foobar.caidx /mnt/foobar

And similarly, let's make a .caibx file available locally as a block device:

$ casync mkdev http://example.com/images/foobar.caibx

This will create a block device in /dev and print the used device node path to STDOUT.

As mentioned, casync is big on reproducibility. Let's make use of that to calculate a digest identifying a very specific version of a file tree:

$ casync digest .

This digest will include all meta-data bits casync and the underlying file system know about. Usually, to make this useful you want to configure exactly what meta-data to include:

$ casync digest --with=unix .

This makes use of the --with=unix shortcut for selecting meta-data fields. Specifying --with=unix selects all meta-data that traditional UNIX file systems support. It is a shortcut for writing out: --with=16bit-uids --with=permissions --with=sec-time --with=symlinks --with=device-nodes --with=fifos --with=sockets.

Note that when calculating digests or creating chunk indexes you may also use the negative --without= option to remove specific features but start from the most precise:

$ casync digest --without=flag-immutable

This generates a digest with the most accurate meta-data, but leaves one feature out: chattr(1)'s immutable (+i) file flag.

To list the contents of a .caidx file use a command like the following:

$ casync list http://example.com/images/foobar.caidx

or

$ casync mtree http://example.com/images/foobar.caidx

The former command will generate a brief list of files and directories, not too different from tar t or ls -al in its output. The latter command will generate a BSD mtree(5) compatible manifest. Note that casync actually stores substantially more file meta-data than mtree files can express, though.
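The reproducible-digest idea can be illustrated in a few lines of Python. To be clear, this is not casync's algorithm or serialization format, just a sketch of the principle behind casync digest: walk the tree in a fixed order and feed an explicitly chosen subset of metadata plus the file contents into a single hash, so that the same tree always produces the same value. The directory path is taken from the examples above.

import hashlib, os

def tree_digest(root: str, with_mode: bool = True) -> str:
    # Toy reproducible digest of a directory tree (not casync's real format).
    # Entries are visited in sorted order and only explicitly selected
    # metadata is hashed, so the result is stable across runs and machines
    # as long as the selected metadata and the file contents are identical.
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                              # fixed traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            h.update(os.path.relpath(path, root).encode() + b"\0")
            if with_mode:                            # opt-in metadata, in the spirit of --with=permissions
                h.update(os.lstat(path).st_mode.to_bytes(4, "little"))
            if os.path.islink(path):
                h.update(os.readlink(path).encode())
            elif os.path.isfile(path):
                with open(path, "rb") as f:
                    h.update(f.read())
            # sockets, FIFOs and device nodes are simply skipped in this sketch
    return h.hexdigest()

print(tree_digest("/some/directory"))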
What casync isn't

casync is not an attempt to minimize serialization and downloaded deltas to the extreme. Instead, the tool is supposed to find a good middle ground that is economical in traffic and disk space, but not at the price of convenience or of requiring explicit revision control. If you care about updates that are absolutely minimal, there are binary delta systems around that might be an option for you, such as Google's Courgette.

casync is not a replacement for rsync, git, zsync or anything like that. They have very different use-cases and semantics. For example, rsync permits you to directly synchronize two file trees remotely. casync just cannot do that, and it is unlikely it ever will.

Where next?

casync is supposed to be a generic synchronization tool. Its primary focus for now is delivery of OS images, but I'd like to make it useful for a couple of other use-cases, too. Specifically:

To make the tool useful for backups, encryption is missing. I have pretty concrete plans for how to add that. When implemented, the tool might become an alternative to restic, BorgBackup or tarsnap.

Right now, if you want to deploy casync in real life, you still need to validate the downloaded .caidx or .caibx file yourself, for example with some gpg signature. It is my intention to integrate with gpg in a minimal way so that signing and verifying chunk index files is done automatically.

In the longer run, I'd like to build an automatic synchronizer for $HOME between systems from this. Each $HOME instance would be stored automatically at regular intervals in the cloud using casync, and conflicts would be resolved locally.

casync is written in a shared-library style, but it is not yet built as one. Specifically this means that almost all of casync's functionality is supposed to be available as a C API soon, so that applications can process casync files on every level. It is my intention to make this library useful enough that it will be easy to write a module for GNOME's gvfs subsystem in order to make remote or local .caidx files directly available to applications (as an alternative to casync mount). In fact the idea is to make this all flexible enough that even the remoting back-ends can be replaced easily, for example to replace casync's default HTTP/HTTPS back-ends built on CURL with GNOME's own HTTP implementation, in order to share cookies, certificates, …

There's also an alternative method to integrate with casync in place already: simply invoke casync as a sub-process. casync will inform you about a certain set of state changes using a mechanism compatible with sd_notify(3). In the future it will also propagate progress data this way and more.

I intend to add a new seeding back-end that sources chunks from the local network. After downloading the new .caidx file off the Internet casync would then search for the listed chunks on the local network first before retrieving them from the Internet. This should speed things up on all installations that have multiple similar systems deployed in the same network.

Further plans are listed tersely in the TODO file.

FAQ:

Is this a systemd project? — casync is hosted under the GitHub systemd umbrella, and the projects share the same coding style. However, the code-bases are distinct and without interdependencies, and casync works fine both on systemd systems and on systems without it.

Is casync portable? — At the moment: no. I only run Linux and that's what I code for.
That said, I am open to accepting portability patches (unlike for systemd, which doesn't really make sense on non-Linux systems), as long as they don't interfere too much with the way casync works. Specifically this means that I am not too enthusiastic about merging portability patches for OSes lacking the openat(2) family of APIs.

Does casync require reflink-capable file systems to work, such as btrfs? — No, it doesn't. The reflink magic in casync is employed when the file system permits it, and it's good to have, but it's not a requirement, and casync will implicitly fall back to copying when it isn't available. Note that casync supports a number of file system features on a variety of file systems that aren't available everywhere, for example FAT's system/hidden file flags or xfs's projinherit file flag.

Is casync stable? — I just tagged the first, initial release. While I have been working on it for quite some time and it is quite featureful, this is the first time I advertise it publicly, and it has hence received very little testing outside of its own test suite. I am also not fully ready to commit to the stability of the current serialization or chunk index format. I don't see any breakages coming for it though. casync is pretty light on documentation right now, and does not even have a man page. I intend to correct that soon, too.

Are the .caidx/.caibx and .catar file formats open and documented? — casync is Open Source, so if you want to know the precise format, have a look at the sources for now. It's definitely my intention to add comprehensive docs for both formats, however. Don't forget this is just the initial version right now.

casync is just like $SOMEOTHERTOOL! Why are you reinventing the wheel (again)? — Well, because casync isn't "just like" some other tool. I am pretty sure I did my homework, and that there is no tool just like casync right now. The tools coming closest are probably rsync, zsync, tarsnap and restic, but each is quite a different beast.

Why did you invent your own serialization format for file trees? Why don't you just use tar? — That's a good question, and other systems — most prominently tarsnap — do that. However, as mentioned above tar doesn't enforce reproducibility. It also doesn't really do random access: if you want to access some specific file you need to read every single byte stored before it in the tar archive to find it, which is of course very expensive. The serialization casync implements places a focus on reproducibility, random access and meta-data control. Much like traditional tar it can still be generated and extracted in a streaming fashion though.

Does casync save/restore SELinux/SMACK file labels? — Not at the moment. That's not because I wouldn't want it to, but simply because I am not a guru of either of these systems, and didn't want to implement something I do not fully grok and cannot test. If you look at the sources you'll find that there are already some definitions in place that keep room for them though. I'd be delighted to accept a patch implementing this fully.

What about delivering squashfs images? How well does chunking work on compressed serializations? — That's a very good point! Usually, if you apply a chunking algorithm to a compressed data stream (let's say a tar.gz file), then changing a single bit at the front will propagate into the entire remainder of the file, so that minimal changes explode into major changes.
Thankfully this doesn't apply that strictly to squashfs images, as squashfs provides random access to files and directories and thus breaks up its compression streams at regular intervals to make seeking easy. This is beneficial for systems employing chunking, such as casync, as it means single-bit changes might affect their vicinity but will not explode in an unbounded fashion. In order to achieve the best results when delivering squashfs images through casync, the block size of squashfs and the chunk size of casync should be matched up (using casync's --chunk-size= option). How precisely to choose both values is left as a research subject for the user, for now.

What does the name casync mean? — It's a synchronizing tool, hence the -sync suffix, following rsync's naming. It makes use of the content-addressable concept of git, hence the ca- prefix.

Where can I get this stuff? Is it already packaged? — Check out the sources on GitHub. I just tagged the first version. Martin Pitt has packaged casync for Ubuntu. There is also an Arch Linux package. Zbigniew Jędrzejewski-Szmek has prepared a Fedora RPM that hopefully will soon be included in the distribution.

Should you care? Is this a tool for you? Well, that's up to you really. If you are involved with projects that need to deliver IoT, VM, container, application or OS images, then maybe this is a great tool for you — but other options exist, some of which are linked above. Note that casync is an Open Source project: if it doesn't do exactly what you need, prepare a patch that adds what you need, and we'll consider it.

If you are interested in the project and would like to talk about this in person, I'll be presenting casync soon at Kinvolk's Linux Technologies Meetup in Berlin, Germany. You are invited. I also intend to talk about it at All Systems Go!, also in Berlin. [Less]
Posted 7 days ago
Introducing casync In the past months I have been working on a new project: casync. casync takes inspiration from the popular rsync file synchronization tool as well as the probably even more popular git revision control system. It combines the idea ... [More] of the rsync algorithm with the idea of git-style content-addressable file systems, and creates a new system for efficiently storing and delivering file system images, optimized for high-frequency update cycles over the Internet. Its current focus is on delivering IoT, container, VM, application, portable service or OS images, but I hope to extend it later in a generic fashion to become useful for backups and home directory synchronization as well (but more about that later). The basic technological building blocks casync is built from are neither new nor particularly innovative (at least not anymore), however the way casync combines them is different from existing tools, and that's what makes it useful for a variety of usecases that other tools can't cover that well. Why? I created casync after studying how today's popular tools store and deliver file system images. To very incomprehensively and briefly name a few: Docker has a layered tarball approach, OSTree serves the individual files directly via HTTP and maintains packed deltas to speed up updates, while other systems operate on the block layer and place raw squashfs images (or other archival file systems, such as IS09660) for download on HTTP shares (in the better cases combined with zsync data). Neither of these approaches appeared fully convincing to me when used in high-frequency update cycle systems. In such systems, it is important to optimize towards a couple of goals: Most importantly, make updates cheap traffic-wise (for this most tools use image deltas of some form) Put boundaries on disk space usage on servers (keeping deltas between all version combinations clients might want to run updates between, would suggest keeping an exponentially growing amount of deltas on servers) Put boundaries on disk space usage on clients Be friendly to Content Delivery Networks (CDNs), i.e. serve neither too many small nor too many overly large files, and only require the most basic form of HTTP. Provide the repository administrator with high-level knobs to tune the average file size delivered. Simplicity to use for users, repository administrators and developers I don't think any of the tools mentioned above are really good on more than a small subset of these points. Specifically: Docker's layered tarball approach dumps the "delta" question onto the feet of the image creators: the best way to make your image downloads minimal is basing your work on an existing image clients might already have, and inherit its resources, maintaing full history. Here, revision control (a tool for the developer) is intermingled with update management (a concept for optimizing production delivery). As container histories grow individual deltas are likely to stay small, but on the other hand a brand-new deployment usually requires downloading the full history onto the deployment system, even though there's no use for it there, and likely requires substantially more disk space and download sizes. OSTree's serving of individual files is unfriendly to CDNs (as many small files in file trees cause an explosion of HTTP GET requests). 
To counter that OSTree supports placing pre-calculated delta images between selected revisions on the delivery servers, which means a certain amount of revision management, that leaks into the clients. Delivering direct squashfs (or other file system) images is almost beautifully simple, but of course means every update requires a full download of the newest image, which is both bad for disk usage and generated traffic. Enhancing it with zsync makes this a much better option, as it can reduce generated traffic substantially at very little cost of history/metadata (no explicit deltas between a large number of versions need to be prepared server side). On the other hand server requirements in disk space and functionality (HTTP Range requests) are minus points for the usecase I am interested in. (Note: all the mentioned systems have great properties, and it's not my intention to badmouth them. They only point I am trying to make is that for the use case I care about — file system image delivery with high high frequency update-cycles — each system comes with certain drawbacks.) Security & Reproducibility Besides the issues pointed out above I wasn't happy with the security and reproducibility properties of these systems. In today's world where security breaches involving hacking and breaking into connected systems happen every day, an image delivery system that cannot make strong guarantees regarding data integrity is out of date. Specifically, the tarball format is famously undeterministic: the very same file tree can result in any number of different valid serializations depending on the tool used, its version and the underlying OS and file system. Some tar implementations attempt to correct that by guaranteeing that each file tree maps to exactly one valid serialization, but such a property is always only specific to the tool used. I strongly believe that any good update system must guarantee on every single link of the chain that there's only one valid representatin of the data to deliver, that can easily be verified. What casync Is So much about the background why I created casync. Now, let's have a look what casync actually is like, and what it does. Here's the brief technical overview: Encoding: Let's take a large linear data stream, split it into variable-sized chunks (the size of each being a function of the chunk's contents), and store these chunks in individual, compressed files in some directory, each file named after a strong hash value of its contents, so that the hash value may be used to as key for retrieving the full chunk data. Let's call this directory a "chunk store". At the same time, generate a "chunk index" file that lists these chunk hash values plus their respective chunk sizes in a simple linear array. The chunking algorithm is supposed to create variable, but similarly sized chunks from the data stream, and do so in a way that the same data results in the same chunks even if placed at varying offsets. For more information see this blog story. Decoding: Let's take the chunk index file, and reassemble the large linear data stream by concatenating the uncompressed chunks retrieved from the chunk store, keyed by the listed chunk hash values. As an extra twist, we introduce a well-defined, reproducible, random-access serialization format for file trees (think: a more modern tar), to permit efficient, stable storage of complete file trees in the system, simply by serializing them and then passing them into the encoding step explained above. 
Finally, let's put all this on the network: for each image you want to deliver, generate a chunk index file and place it on an HTTP server. Do the same with the chunk store, and share it between the various index files you intend to deliver. Why bother with all of this? Streams with similar contents will result in mostly the same chunk files in the chunk store. This means it is very efficient to store many related versions of a data stream in the same chunk store, thus minimizing disk usage. Moreover, when transferring linear data streams chunks already known on the receiving side can be made use of, thus minimizing network traffic. Why is this different from rsync or OSTree, or similar tools? Well, one major difference between casync and those tools is that we remove file boundaries before chunking things up. This means that small files are lumped together with their siblings and large files are chopped into pieces, which permits us to recognize similarities in files and directories beyond file boundaries, and makes sure our chunk sizes are pretty evenly distributed, without the file boundaries affecting them. The "chunking" algorithm is based on a the buzhash rolling hash function. SHA256 is used as strong hash function to generate digests of the chunks. xz is used to compress the individual chunks. Here's a diagram, hopefully explaining a bit how the encoding process works, wasn't it for my crappy drawing skills: The diagram shows the encoding process from top to bottom. It starts with a block device or a file tree, which is then serialized and chunked up into variable sized blocks. The compressed chunks are then placed in the chunk store, while a chunk index file is written listing the chunk hashes in order. (The original SVG of this graphic may be found here. Details Note that casync operates on two different layers, depending on the usecase of the user: You may use it on the block layer. In this case the raw block data on disk is taken as-is, read directly from the block device, split into chunks as described above, compressed, stored and delivered. You may use it on the file system layer. In this case, the file tree serialization format mentioned above comes into play: the file tree is serialized depth-first (much like tar would do it) and then split into chunks, compressed, stored and delivered. The fact that it may be used on both the block and file system layer opens it up for a variety of different usecases. In the VM and IoT ecosystems shipping images as block-level serializations is more common, while in the container and application world file-system-level serializations are more typically used. Chunk index files referring to block-layer serializations carry the .caibx suffix, while chunk index files referring to file system serializations carry the .caidx suffix. Note that you may also use casync as direct tar replacement, i.e. without the chunking, just generating the plain linear file tree serialization. Such files carry the .catar suffix. Internally .caibx are identical to .caidx files, the only difference is semantical: .caidx files describe a .catar file, while .caibx files may describe any other blob. Finally, chunk stores are directories carrying the .castr suffix. 
Features Here are a couple of other features casync has: When downloading a new image you may use casync's --seed= feature: each block device, file, or directory specified is processed using the same chunking logic described above, and is used as preferred source when putting together the downloaded image locally, avoiding network transfer of it. This of course is useful whenever updating an image: simply specify one or more old versions as seed and only download the chunks that truly changed since then. Note that using seeds requires no history relationship between seed and the new image to download. This has major benefits: you can even use it to speed up downloads of relatively foreign and unrelated data. For example, when downloading a container image built using Ubuntu you can use your Fedora host OS tree in /usr as seed, and casync will automatically use whatever it can from that tree, for example timezone and locale data that tends to be identical between distributions. Example: casync extract http://example.com/myimage.caibx --seed=/dev/sda1 /dev/sda2. This will place the block-layer image described by the indicated URL in the /dev/sda2 partition, using the existing /dev/sda1 data as seeding source. An invocation like this could be typically used by IoT systems with an A/B partition setup. Example 2: casync extract http://example.com/mycontainer-v3.caidx --seed=/srv/container-v1 --seed=/srv/container-v2 /src/container-v3, is very similar but operates on the file system layer, and uses two old container versions to seed the new version. When operating on the file system level, the user has fine-grained control on the metadata included in the serialization. This is relevant since different usecases tend to require a different set of saved/restored metadata. For example, when shipping OS images, file access bits/ACLs and ownership matter, while file modification times hurt. When doing personal backups OTOH file ownership matters little but file modification times are important. Moreover different backing file systems support different feature sets, and storing more information than necessary might make it impossible to validate a tree against an image if the metadata cannot be replayed in full. Due to this, casync provides a set of --with= and --without= parameters that allow fine-grained control of the data stored in the file tree serialization, including the granularity of modification times and more. The precise set of selected metadata features is also always part of the serialization, so that seeding can work correctly and automatically. casync tries to be as accurate as possible when storing file system metadata. This means that besides the usual baseline of file metadata (file ownership and access bits), and more advanced features (extended attributes, ACLs, file capabilities) a number of more exotic data is stored as well, including Linux chattr(1) file attributes, as well as FAT file attributes (you may wonder why the latter? — EFI is FAT, and /efi is part of the comprehensive serialization of any host). In the future I intend to extend this further, for example storing btrfs subvolume information where available. Note that as described above every single type of metadata may be turned off and on individually, hence if you don't need FAT file bits (and I figure it's pretty likely you don't), then they won't be stored. The user creating .caidx or .caibx files may control the desired average chunk length (before compression) freely, using the --chunk-size= parameter. 
Smaller chunks increase the number of generated files in the chunk store and increase HTTP GET load on the server, but also ensure that sharing between similar images is improved, as identical patterns in the images stored are more likely to be recognized. By default casync will use a 64K average chunk size. Tweaking this can be particularly useful when adapting the system to specific CDNs, or when delivering compressed disk images such as squashfs (see below). Emphasis is placed on making all invocations reproducible, well-defined and strictly deterministic. As mentioned above this is a requirement to reach the intended security guarantees, but is also useful for many other usecases. For example, the casync digest command may be used to calculate a hash value identifying a specific directory in all desired detail (use --with= and --without to pick the desired detail). Moreover the casync mtree command may be used to generate a BSD mtree(5) compatible manifest of a directory tree, .caidx or .catar file. The file system serialization format is nicely composable. By this I mean that the serialization of a file tree is the concatenation of the serializations of all files and file subtrees located at the top of the tree, with zero metadata references from any of these serializations into the others. This property is essential to ensure maximum reuse of chunks when similar trees are serialized. When extracting file trees or disk image files, casync will automatically create reflinks from any specified seeds if the underlying file system supports it (such as btrfs, ocfs, and future xfs). After all, instead of copying the desired data from the seed, we can just tell the file system to link up the relevant blocks. This works both when extracting .caidx and .caibx files — the latter of course only when the extracted disk image is placed in a regular raw image file on disk, rather than directly on a plain block device, as plain block devices do not know the concept of reflinks. Optionally, when extracting file trees, casync can create traditional UNIX hardlinks for identical files in specified seeds (--hardlink=yes). This works on all UNIX file systems, and can save substantial amounts of disk space. However, this only works for very specific usecases where disk images are considered read-only after extraction, as any changes made to one tree will propagate to all other trees sharing the same hardlinked files, as that's the nature of hardlinks. In this mode, casync exposes OSTree-like behaviour, which is built heavily around read-only hardlink trees. casync tries to be smart when choosing what to include in file system images. Implicitly, file systems such as procfs and sysfs are exluded from serialization, as they expose API objects, not real files. Moreover, the "nodump" (+d) chattr(1) flag is honoured by default, permitting users to mark files to exclude from serialization. When creating and extracting file trees casync may apply an automatic or explicit UID/GID shift. This is particularly useful when transferring container image for use with Linux user namespacing. In addition to local operation, casync currently supports HTTP, HTTPS, FTP and ssh natively for downloading chunk index files and chunks (the ssh mode requires installing casync on the remote host, though, but an sftp mode not requiring that should be easy to add). When creating index files or chunks, only ssh is supported as remote backend. 
When operating on block-layer images, you may expose locally or remotely stored images as local block devices. Example: casync mkdev http://example.com/myimage.caibx exposes the disk image described by the indicated URL as a local block device in /dev, on which you may then use the usual block device tools, such as mount or fdisk (read-only, though). Chunks are downloaded with high priority on access, and at low priority in the background when idle. Note that in this mode casync also plays a role similar to "dm-verity", as all blocks are validated against the strong digests in the chunk index file before passing them on to the kernel's block layer. This feature is implemented through Linux' NBD kernel facility.

Similarly, when operating on file-system-layer images, you may mount locally or remotely stored images as regular file systems. Example: casync mount http://example.com/mytree.caidx /srv/mytree mounts the file tree image described by the indicated URL as a local directory /srv/mytree. This feature is implemented through Linux' FUSE kernel facility. Note that special care is taken that the images exposed this way can be packed up again with casync make and are guaranteed to return the bit-by-bit exact same serialization they were mounted from. No data is lost or changed while passing things through FUSE (OK, strictly speaking this is a lie, we do lose ACLs, but that's hopefully just a temporary gap to be fixed soon).

In IoT A/B fixed-size partition setups the file systems placed in the two partitions are usually much smaller than the partitions themselves, in order to keep some room for later, larger updates. casync is able to analyze the superblock of a number of common file systems in order to determine the actual size of a file system stored on a block device, so that writing a file system to such a partition and reading it back again will result in reproducible data. Moreover this speeds up the seeding process, as there's little point in seeding the empty space after the file system within the partition.

Example Command Lines

Here's how to use casync, explained with a few examples:

$ casync make foobar.caidx /some/directory

This will create a chunk index file foobar.caidx in the local directory, and populate the chunk store directory default.castr located next to it with the chunks of the serialization (you can change the name of the store directory with --store= if you like). This command operates on the file-system level. A similar command operating on the block level:

$ casync make foobar.caibx /dev/sda1

This command creates a chunk index file foobar.caibx in the local directory describing the current contents of the /dev/sda1 block device, and populates default.castr in the same way as above. Note that you may just as well read a raw disk image from a file instead of a block device:

$ casync make foobar.caibx myimage.raw

To reconstruct the original file tree from the .caidx file and the chunk store of the first command, use:

$ casync extract foobar.caidx /some/other/directory

And similarly for the block-layer version:

$ casync extract foobar.caibx /dev/sdb1

or, to extract the block-layer version into a raw disk image:

$ casync extract foobar.caibx myotherimage.raw

The above are the most basic commands, operating on local data only. Now let's make this more interesting, and reference remote resources:

$ casync extract http://example.com/images/foobar.caidx /some/other/directory

This extracts the specified .caidx onto a local directory. This of course assumes that foobar.caidx was uploaded to the HTTP server in the first place, along with the chunk store. You can use any command you like to accomplish that, for example scp or rsync.
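A hedged sketch of such an upload with rsync, assuming /var/www/images/ is what the web server exposes as http://example.com/images/:

$ rsync -a foobar.caidx default.castr admin@example.com:/var/www/images/

This keeps the default.castr store next to the index file on the server, which matches the default store location casync derives from the index URL.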
Alternatively, you can let casync do this directly when generating the chunk index:

$ casync make ssh.example.com:images/foobar.caidx /some/directory

This will use ssh to connect to the ssh.example.com server, and then place the .caidx file and the chunks on it. Note that this mode of operation is "smart": this scheme will only upload chunks currently missing on the server side, and not retransmit what is already available. Note that you can always configure the precise path or URL of the chunk store via the --store= option. If you do not do that, then the store path is automatically derived from the path or URL: the last component of the path or URL is replaced by default.castr.

Of course, when extracting .caidx or .caibx files from remote sources, using a local seed is advisable:

$ casync extract http://example.com/images/foobar.caidx --seed=/some/existing/directory /some/other/directory

Or on the block layer:

$ casync extract http://example.com/images/foobar.caibx --seed=/dev/sda1 /dev/sdb2

When creating chunk indexes on the file system layer, casync will by default store metadata as accurately as possible. Let's create a chunk index with reduced metadata:

$ casync make foobar.caidx --with=sec-time --with=symlinks --with=read-only /some/dir

This command will create a chunk index for a file tree serialization that has three features above the absolute baseline supported: 1s granularity timestamps, symbolic links and a single read-only bit. In this mode, none of the other metadata bits are stored, including nanosecond timestamps, full UNIX permission bits, file ownership, ACLs or extended attributes.

Now let's make a .caidx file available locally as a mounted file system, without extracting it:

$ casync mount http://example.com/images/foobar.caidx /mnt/foobar

And similarly, let's make a .caibx file available locally as a block device:

$ casync mkdev http://example.com/images/foobar.caibx

This will create a block device in /dev and print the used device node path to STDOUT.

As mentioned, casync is big on reproducibility. Let's make use of that to calculate a digest identifying a very specific version of a file tree:

$ casync digest .

This digest will include all metadata bits casync and the underlying file system know about. Usually, to make this useful you want to configure exactly what metadata to include:

$ casync digest --with=unix .

This makes use of the --with=unix shortcut for selecting metadata fields: it selects all metadata that traditional UNIX file systems support. It is a shortcut for writing out: --with=16bit-uids --with=permissions --with=sec-time --with=symlinks --with=device-nodes --with=fifos --with=sockets.

Note that when calculating digests or creating chunk indexes you may also use the negative --without= option to remove specific features while starting from the most precise set:

$ casync digest --without=flag-immutable

This generates a digest with the most accurate metadata, but leaves one feature out: chattr(1)'s immutable (+i) file flag.

To list the contents of a .caidx file use a command like the following:

$ casync list http://example.com/images/foobar.caidx

or

$ casync mtree http://example.com/images/foobar.caidx

The former command will generate a brief list of files and directories, not too different from tar t or ls -al in its output. The latter command will generate a BSD mtree(5) compatible manifest. Note that casync actually stores substantially more file metadata than mtree files can express, though.
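Since the manifest is plain text, it can also be used to eyeball the differences between two image versions; a hedged sketch, assuming the manifest is written to standard output and using hypothetical URLs:

$ casync mtree http://example.com/images/foobar-v1.caidx > v1.mtree
$ casync mtree http://example.com/images/foobar-v2.caidx > v2.mtree
$ diff -u v1.mtree v2.mtree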
What casync isn't

casync is not an attempt to minimize serialization and downloaded deltas to the extreme. Instead, the tool is supposed to find a good middle ground that is economical on traffic and disk space, but not at the price of convenience or of requiring explicit revision control. If you care about updates that are absolutely minimal, there are binary delta systems around that might be an option for you, such as Google's Courgette.

casync is not a replacement for rsync, or git or zsync or anything like that. They have very different use cases and semantics. For example, rsync permits you to directly synchronize two file trees remotely. casync just cannot do that, and it is unlikely it ever will.

Where next?

casync is supposed to be a generic synchronization tool. Its primary focus for now is delivery of OS images, but I'd like to make it useful for a couple of other use cases, too. Specifically:

To make the tool useful for backups, encryption is missing. I have pretty concrete plans for how to add that. When implemented, the tool might become an alternative to restic or tarsnap.

Right now, if you want to deploy casync in real life, you still need to validate the downloaded .caidx or .caibx file yourself, for example with some gpg signature. It is my intention to integrate with gpg in a minimal way so that signing and verifying chunk index files is done automatically.

In the longer run, I'd like to build an automatic synchronizer for $HOME between systems from this. Each $HOME instance would be stored automatically at regular intervals in the cloud using casync, and conflicts would be resolved locally.

casync is written in a shared library style, but it is not yet built as one. Specifically this means that almost all of casync's functionality is supposed to be available as a C API soon, so that applications can process casync files on every level. It is my intention to make this library useful enough that it will be easy to write a module for GNOME's gvfs subsystem in order to make remote or local .caidx files directly available to applications (as an alternative to casync mount). In fact the idea is to make this all flexible enough that even the remoting backends can be replaced easily, for example to replace casync's default HTTP/HTTPS backends built on CURL with GNOME's own HTTP implementation, in order to share cookies, certificates, …

There's also an alternative method to integrate with casync already in place: simply invoke casync as a subprocess. casync will inform you about a certain set of state changes using a mechanism compatible with sd_notify(3). In the future it will also propagate progress data this way and more.

I intend to add a new seeding back-end that sources chunks from the local network. After downloading the new .caidx file off the Internet, casync would then search for the listed chunks on the local network first before retrieving them from the Internet. This should speed things up on all installations that have multiple similar systems deployed in the same network.

Further plans are listed tersely in the TODO file.

FAQ:

Is this a systemd project?
casync is hosted under the github systemd umbrella, and the projects share the same coding style. However, the codebases are distinct and without interdependencies, and casync works fine both on systemd systems and systems without it.

Is casync portable?
At the moment: no. I only run Linux and that's what I code for. That said, I am open to accepting portability patches (unlike for systemd, which doesn't really make sense on non-Linux systems), as long as they don't interfere too much with the way casync works. Specifically this means that I am not too enthusiastic about merging portability patches for OSes lacking the openat(2) family of APIs.

Does casync require reflink-capable file systems to work, such as btrfs?
No, it doesn't. The reflink magic in casync is employed when the file system permits it, and it's good to have it, but it's not a requirement, and casync will implicitly fall back to copying when it isn't available. Note that casync supports a number of file system features on a variety of file systems that aren't available everywhere, for example FAT's system/hidden file flags or xfs's projinherit file flag.

Is casync stable?
I just tagged the first, initial release. While I have been working on it for quite some time and it is quite featureful, this is the first time I advertise it publicly, and it has hence received very little testing outside of its own test suite. I am also not fully ready to commit to the stability of the current serialization or chunk index format. I don't see any breakages coming for it, though. casync is pretty light on documentation right now, and does not even have a man page. I intend to correct that soon.

Are the .caidx/.caibx and .catar file formats open and documented?
casync is Open Source, so if you want to know the precise format, have a look at the sources for now. It's definitely my intention to add comprehensive docs for both formats, however. Don't forget this is just the initial version right now.

casync is just like $SOMEOTHERTOOL! Why are you reinventing the wheel (again)?
Well, because casync isn't "just like" some other tool. I am pretty sure I did my homework, and there is no tool just like casync right now. The tools coming closest are probably rsync, zsync, tarsnap and restic, but each is quite a different beast.

Why did you invent your own serialization format for file trees? Why don't you just use tar?
That's a good question, and other systems — most prominently tarsnap — do that. However, as mentioned above, tar doesn't enforce reproducibility. It also doesn't really do random access: if you want to access some specific file you need to read every single byte stored before it in the tar archive to find it, which is of course very expensive. The serialization casync implements places a focus on reproducibility, random access, and metadata control. Much like traditional tar, it can still be generated and extracted in a streaming fashion, though.

Does casync save/restore SELinux/SMACK file labels?
At the moment, no. That's not because I wouldn't want it to, but simply because I am not a guru of either of these systems, and didn't want to implement something I do not fully grok and cannot test. If you look at the sources you'll find that there are already some definitions in place that leave room for them, though. I'd be delighted to accept a patch implementing this fully.

What about delivering squashfs images? How well does chunking work on compressed serializations?
That's a very good point! Usually, if you apply a chunking algorithm to a compressed data stream (let's say a tar.gz file), then changing a single bit at the front will propagate into the entire remainder of the file, so that minimal changes will explode into major changes. Thankfully this doesn't apply that strictly to squashfs images, as squashfs provides random access to files and directories and thus breaks up its compression streams at regular intervals to make seeking easy. This is beneficial for systems employing chunking, such as casync, as it means single-bit changes might affect their vicinity but will not propagate in an unbounded fashion. In order to achieve the best results when delivering squashfs images through casync, the block size of squashfs and the chunk size of casync should be matched up (using casync's --chunk-size= option). How precisely to choose both values is left as a research subject for the user, for now.
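As a starting point, one hedged approach is to simply pick the same value for both, here 128 KiB (mksquashfs takes the block size via -b; whether --chunk-size= accepts a plain byte count like this is an assumption):

$ mksquashfs rootfs/ rootfs.squashfs -b 131072
$ casync make --chunk-size=131072 rootfs.caibx rootfs.squashfs

Keep in mind that casync's chunk size is an average for its content-defined chunking, so chunk boundaries will not line up exactly with squashfs blocks.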
What does the name casync mean?
It's a synchronizing tool, hence the -sync suffix, following rsync's naming. It makes use of the content-addressable concept of git, hence the ca- prefix.

Where can I get this stuff? Is it already packaged?
Check out the sources on GitHub. I just tagged the first version. Martin Pitt has packaged casync for Ubuntu. There is also an ArchLinux package. Zbigniew Jędrzejewski-Szmek has prepared a Fedora RPM that hopefully will soon be included in the distribution.

Should you care? Is this a tool for you?
Well, that's up to you really. If you are involved with projects that need to deliver IoT, VM, container, application or OS images, then maybe this is a great tool for you — but other options exist, some of which are linked above. Note that casync is an Open Source project: if it doesn't do exactly what you need, prepare a patch that adds what you need, and we'll consider it.

If you are interested in the project and would like to talk about this in person, I'll be presenting casync soon at Kinvolk's Linux Technologies Meetup in Berlin, Germany. You are invited. I also intend to talk about it at All Systems Go!, also in Berlin. [Less]
Posted 7 days ago
The All Systems Go! 2017 Call for Participation is Now Open!

We’d like to invite presentation proposals for All Systems Go! 2017! All Systems Go! is an Open Source community conference focused on the projects and technologies at the foundation of ... [More] modern Linux systems — specifically low-level user-space technologies. Its goal is to provide a friendly and collaborative gathering place for individuals and communities working to push these technologies forward. All Systems Go! 2017 takes place in Berlin, Germany on October 21st and 22nd. All Systems Go! is a 2-day event with 2-3 talks happening in parallel. Full presentation slots are 30-45 minutes in length and lightning talk slots are 5-10 minutes.

We are now accepting submissions for presentation proposals. In particular, we are looking for sessions including, but not limited to, the following topics:

- Low-level container executors and infrastructure
- IoT and embedded OS infrastructure
- OS, container, IoT image delivery and updating
- Building Linux devices and applications
- Low-level desktop technologies
- Networking
- System and service management
- Tracing and performance measuring
- IPC and RPC systems
- Security and Sandboxing

While our focus is definitely more on the user-space side of things, talks about kernel projects are welcome too, as long as they have a clear and direct relevance for user-space. Please submit your proposals by September 3rd. Notification of acceptance will be sent out 1-2 weeks later. To submit your proposal now please visit our CFP submission web site. For further information about All Systems Go! visit our conference web site.

systemd.conf will not take place this year, in favor of All Systems Go!. All Systems Go! welcomes all projects that contribute to Linux user space, which, of course, includes systemd. Thus, anything you think was appropriate for submission to systemd.conf is also fitting for All Systems Go! [Less]
Posted 8 days ago
I just released the first release candidate for libinput 1.8. Aside from the build system switch to meson, one of the more visible things is that the helper tools have switched from a "libinput-some-tool" to the "libinput some-tool" approach (note the ... [More] space). This is similar to what git does, so it won't take a lot of adjustment for most developers. The actual tools are now hiding in /usr/libexec/libinput. This gives us a lot more flexibility in writing testing and debugging tools and shipping them to the users without cluttering up the default PATH.

There are two potential breakages here. One is that the two existing tools libinput-debug-events and libinput-list-devices have been renamed too. We currently ship compatibility wrappers for those but expect those wrappers to go away with future releases. The second breakage is of lesser impact: typing "man libinput" used to bring up the man page for the xf86-input-libinput driver. Now it brings up the man page for the libinput tool, which then points to the man pages of the various features. That's probably a good thing: it puts the documentation a bit closer to the user. For the driver, you now have to type "man 4 libinput" though. [Less]
Posted 8 days ago
Last week I got review on the new interface for tiling for vc4, and landed it in the kernel. I’ve submitted Mesa patches to the list, landed the kernel backport in Raspbian, and have sent Simon (the package maintainer at the Foundation) the Mesa ... [More] patchset to backport all of my work from the last 6 months (unfortunately Raspbian generally doesn’t update Mesa past the Debian stable release, so graphics work for them has long delays). I also worked with Dom at the Foundation to get an interface for the firmwarekms mode to do tiled scanout, so it will be as fast as the open source driver.

I also reworked the official 7” touchscreen panel driver again. I had initially written it as a DSI panel driver, then got told it should be a bridge driver plus a panel driver, and then the bridge maintainers told me it should be only one panel driver, but attached to I2C. Maybe this version will stick; who knows.

My Mesa side for pl111 is now in master, so I’ve got the second platform for the vc4 driver basically done. I can’t wait to do another one.

I got a question about whether VC4 could do EGL_ANDROID_native_fence, to allow using Android’s hwcomposer (the display server that puts surfaces in KMS planes if it can). I spent a while looking into it, and found that while code had landed in Mesa for other drivers, the piglit tests had been left untouched for the last half a year. I reviewed them and hopefully they can land soon. Unfortunately, they don’t cover the important functionality of the extension, so I can’t use them to actually write the driver code (which doesn’t look very hard) until I build some more tests.

On the non-vc4 kernel side, I prepared and submitted pull requests for 4.13. We’ve got USB OTG support for the Pi Zero, SDHOST enabled by default finally, thermal enabled by default finally, and the RPi3 DT being installed for 32-bit kernels. Three of these took months longer than they should have, because the Linux kernel has the worst software development process I’ve worked with since XFree86. [Less]