I’ve been seeing a lot of talk about CachyOS recently. Has anyone here tried it? It seems interesting and I might give it a go (currently on EndeavourOS) on a spare drive in my PC.
It’s not really the same thing. EndeavourOS is basically vanilla Arch + a few branding packages. CachyOS is an opinionated Arch with optimised packages.
You still have the option to select the DE and the packages you want to install - just like EndeavourOS - but what sets Cachy apart is the optimisations. For starters, they have multiple custom kernel options, with the BORE scheduler (and a few others), LTO options etc. Then they also have packages compiled for the x86-64-v3 and v4 architectures for better performance.
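If you’re curious whether your machine even qualifies, the glibc dynamic loader can tell you which feature levels it will actually use (needs glibc 2.33 or newer; on Arch-based systems the path below should exist):

```bash
# Lists x86-64-v2/v3/v4 and whether each is "(supported, searched)" on this CPU
/lib/ld-linux-x86-64.so.2 --help | grep supported
```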
Of course, you could also just use Arch (or EndeavourOS) and install the x86-64-v3/v4 packages yourself from ALHP (or even the Cachy repos), and you can even manually install the Cachy kernel or a similar optimised one like Xanmod. But you don’t get the custom configs / opinionated stuff, which you may actually not want as a veteran user. But if you’re a newbie, then having those opinionated configs isn’t such a bad idea, especially if you decide to just get a WM instead of a DE.
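For reference, the ALHP route is roughly a pacman.conf edit like the one below; the repo names and ordering here are from memory, so check the ALHP README (and install their keyring/mirrorlist packages) before copying anything:

```
# /etc/pacman.conf -- these go *above* the stock [core]/[extra] sections
[core-x86-64-v3]
Include = /etc/pacman.d/alhp-mirrorlist

[extra-x86-64-v3]
Include = /etc/pacman.d/alhp-mirrorlist
```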
I’ve been through all of the above scenarios, depending on the situation. My homelab is vanilla Arch but with packages from the Cachy repo. I’ve also got a pure Cachy install on my gaming desktop, because I was feeling lazy and just wanted an optimised install quickly. They also have a gaming meta package that installs Steam and all the necessary 32-bit libs and such, which is nice.
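If I remember right, that meta package is a regular pacman install once you’re on their repos; the exact name below is from memory, so search first:

```bash
# Confirm the package name, then pull it in (needs [multilib] enabled for the 32-bit libs)
pacman -Ss cachyos-gaming
sudo pacman -S cachyos-gaming-meta
```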
Then there’s Cachy Browser, which is a fork of LibreWolf with performance optimisations (kinda similar to Mercury browser, except Mercury isn’t MARCH optimised).
As for support, their Discord is pretty active, you can actually chat with the developers directly, and they’re pretty friendly (and this includes Piotr Gorski, the main dev, and firelzrd - the person behind the BORE scheduler). Chatting with them, I find the quality of technical discussions a LOT higher than the Arch Discord, which is very off-topic and spammy most of the time.
Also, I liked their response to Arch changes and incidents. When Arch made the recent mkinitcpio changes, they made a very thorough announcement with the exact steps you needed to take (far more detailed than the official Arch announcement). And when the xz backdoor happened, they updated their repos to fix it even before Arch did.
I’ve also interacted with the devs personally on various technical topics, such as CFLAGS and -march optimisations, performance benchmarking etc, and it seems like they definitely know their stuff.
So I’ve full confidence in their technical ability, and I’m happy to recommend the distro for folks interested in performance tuning.
cc: @governorkeagan@lemdro.id
It was my understanding that this was all but pointless to do these days.
That depends on your CPU, hardware and workloads.
You’re probably thinking of Intel and AVX512 (x86-64-v4), in which case, yes, it’s pointless because Intel screwed up the implementation; but that’s not the case for AMD. Of course, that assumes your program actually makes use of AVX512. v3 is worth it though.
In any case, the usual places where you’d see improvements are compiling stuff, compression, encryption and audio/video encoding (ofc, if your codec is accelerated by your hardware, that’s a moot point). Sometimes the improvements aren’t apparent in normal benchmarks but have an overall impact; for instance, if you use filesystem compression, the optimisations mean you now have lower I/O latency, and so on.
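If you’d rather not take anyone’s word for it, a crude before/after test is enough to see whether the difference is even measurable on your workload. A sketch with hyperfine (it’s in the Arch repos; the file name and compression levels are just placeholders):

```bash
# Run once on the stock packages, switch to the v3/v4 repos and update, then run again and compare
hyperfine --warmup 2 --min-runs 10 \
  'zstd -19 -f -o /dev/null ./some-large-file' \
  'xz -9 -c ./some-large-file > /dev/null'
```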
More importantly, if you’re a laptop user, this could mean better battery life, since you’re using more efficient instructions; certain operations that might’ve taken 4 CPU cycles could be done in 2, etc.
In my own experience on both my Zen 2 and Zen 4 machines, v3/v4 packages made a visible difference. And that’s not really surprising, because if you take a look at the instructions you’re missing out on, you’d be like ‘wtf’:
CMPXCHG16B, LAHF-SAHF, POPCNT, SSE3, SSE4_1, SSE4_2, SSSE3, AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, OSXSAVE
And this is not counting any of the AVX512 instructions in v4, or all the CPU-specific instructions, e.g. in znver4. It really doesn’t make sense that you’re spending so much money buying a fancy CPU but not making use of half of its features…
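You can check which of those flags your CPU actually advertises, and what GCC would enable with -march=native, with something like:

```bash
# CPU side: which v3-level flags show up in /proc/cpuinfo
grep -o -w -E 'avx2?|avx512f|bmi1|bmi2|fma|f16c|sse4_1|sse4_2' /proc/cpuinfo | sort -u

# Compiler side: what -march=native resolves to on this machine
gcc -march=native -Q --help=target | grep -E -e '-march=' -e '-mavx2' -e '-mavx512f'
```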
[citation needed]
Those would show up in any benchmark that is sensitive to I/O latency.
Also, again, [citation needed] that march optimisations measurably lower I/O latency for compressed I/O. For that to happen it is a necessary condition that compression is a significant component in I/O latency to begin with. If 99% of the time was spent waiting for the device to write the data, optimising the 1% of time spent on compression by even as much as 20% would not gain you anything of significance. This is obviously an exaggerated example but, given how absolutely dog slow most I/O devices are compared to how fast CPUs are these days, not entirely unrealistic.
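To put numbers on that exaggerated example (Amdahl’s law, with compression at 1% of total I/O time and that 1% sped up by 20%), assuming bc is installed:

```bash
# overall speedup = 1 / ((1 - p) + p/s), here p = 0.01 and s = 1.2
echo 'scale=6; 1 / (0.99 + 0.01/1.2)' | bc -l
# ~1.00167, i.e. roughly a 0.17% end-to-end improvement
```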
Generally, the effect of such esoteric “optimisations” is so small that the length of your unix username has a greater effect on real-world performance. I wish I was kidding.
You have to account for a lot of variables and measurement biases if you want to make factual claims about them. You can observe performance differences on the order of 5-10% just due to slight memory layout changes with different compile flags, without any actual performance improvement due to the change in code generation.
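If anyone wants to see that bias first-hand, one crude probe is to keep the binary and flags identical and only vary the size of the environment between runs (which shifts the initial stack layout), then watch how much the timings move. Whatever swing you get from that alone is the noise floor any -march claim has to beat. Here ./your-benchmark is a placeholder and hyperfine is assumed to be installed:

```bash
# Same binary, same compile flags; only the environment size changes between runs
for pad in 0 64 256 1024 4096; do
  env PAD="$(head -c "$pad" /dev/zero | tr '\0' x)" \
    hyperfine --warmup 3 --min-runs 20 './your-benchmark'
done
```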
That’s not my opinion, that’s a rather well-established fact. Read here:
So far, I have yet to see data that shows a significant performance increase from march optimisations which either controlled for the measurement bias or showed an effect that couldn’t be explained by measurement bias alone.
There might be an improvement and my personal hypothesis is that there is at least a small one but, so far, we don’t actually know.
The more realistic case is that an execution that would have taken 4 CPU cycles on average would then take 3.9 CPU cycles.
I don’t have data on how power scales with varying cycles/task at a constant task/time but I doubt it’s linear, especially with all the complexities surrounding speculative execution.
“visible” in what way? March optimisations are hardly visible in controlled synthetic tests…
These features cater towards specialised workloads, not general purpose computing.
Applications which facilitate such specialised workloads and are performance-critical usually have hand-made assembly for the critical paths where these specialised instructions can make a difference. Generic compiler optimisations will do precisely nothing to improve performance in any way in that case.
I’d worry more about your applications not making any use of all the cores you’ve paid good money for. Spoiler alert: Compiler optimisations don’t help with that problem one bit.
I really wish they would stop using Discord. None of the content is even visible on a search engine. It’s a closed door system.
Not true, that company recently scraped all of Discord, so now at least the bad guys have a working search engine! One step at a time. /jk
Yep, fair point. I’m a fan of old-school forums myself, like phpBB.
Thank you for the detailed answer, I really appreciate it. I’ve had this EOS install for almost 3 years now, and I have multiple drives that are full of things. Very happy with it, too. Moving distros isn’t as easy for me as it used to be because of the drives and all the things I have set up; I don’t want to go through the pain of setting everything up again. I will, however, try CachyOS either in a VM or on a little laptop I have that I use for trying things for fun.
If you’re already on Arch/EOS, you don’t need to “move distros”; all you need to do (ish) is update your pacman.conf with Cachy’s repos and run pacman -Syuu to reinstall your packages (rough sequence sketched below). Oh, and you might also want to install the Cachy kernel and maybe the browser for the full experience. Your files and config will remain the same, unless you plan to update/merge them, in which case I’d recommend replacing your makepkg.conf with the one Cachy provides, for the optimised compiler flags. Other than that, there’s no significant difference between the default configs and Cachy’s. In fact, EndeavourOS actually deviates more, since it uses dracut for generating the initrd, whereas Cachy, like Arch, defaults to mkinitcpio.

Anyway, there’s not much point trying CachyOS in a VM since it’s really not that much different from EndeavourOS (from a UX point of view); the whole point of Cachy is to eke out the best performance from your system, so running it in a VM defeats the purpose.
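Roughly, the whole switch looks like this once Cachy’s repos are in your pacman.conf (follow their wiki for the repo setup; the kernel and browser package names below are from memory, so double-check them):

```bash
sudo pacman -Syuu                                    # reinstall/replace packages with the repo's optimised builds
sudo pacman -S linux-cachyos linux-cachyos-headers   # their default kernel variant
sudo pacman -S cachy-browser                         # optional, for the full experience
```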
You’ve answered another question I had (I asked it in another comment), thank you! I’ll give the kernel and Cachy repo a try on my EOS install and see how it goes. Thanks again for the detailed response, it’s super useful!
I installed CachyOS in a VM (Proxmox) just to check out the OOTB experience, and I’m glad I did.
In a lot of ways, it is similar to EOS as you say. That is a compliment as I really like EOS.
The UX is a bit different though. Lots more blue than purple, of course. On the command-line side the differences are bigger. It uses the fish shell with a jazzed-up prompt (reminded me of Garuda). There are a tonne of aliases. They clearly like Rust, as a few of the Rust coreutils alternatives are installed. They even alias ls to eza.
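To give a flavour, the aliasing amounts to something like the lines below (illustrative only, not the exact set they ship; this form works in both bash and fish):

```bash
alias ls='eza'     # eza: Rust replacement for ls
alias cat='bat'    # example only, I don't remember whether bat is aliased by default
```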
Both yay and paru are installed out of the box, which is awesome.
The default file system was XFS. Btrfs and ZFS were both options. No bcachefs at install time, but it’s available after.