I let Claude Code configure my Arch install

16 points by willmorrison a day ago on lobsters | 30 comments

hyPiRion | a day ago

This was one of the things I hoped AI would help me with: issues with broken Linux drivers and the like. You know, the actual boring stuff I don't really want to learn. So I let Claude Code attempt to solve my broken Intel MIPI camera on a Carbon Thinkpad, and after a million diagnostic commands (some of which didn't even work), Claude confidently told me how to fix it. I tried it and it didn't work (unsurprisingly), and I repeated this 1-2 more times. At the end of this, my system was in a state where opening cheese somehow caused my bluetooth headset to sometimes disconnect from my machine. The actual solution was to pull in some newer packages from Debian testing, which I found by using my own brain and a bit of research.

The same thing happened when I tried to get ssh-askpass (or something equivalent) to work on Wayland with Niri: Claude couldn't help me get it up and running, but deep in some random Reddit thread I found a comment that partially worked, and I changed some configuration to make it work on Debian.

It's not that I don't believe the author... but when I read all these stories about how amazing LLMs are at solving problems, it feels like I'm being gaslit. It just doesn't match my experience at all.

stapelberg | 17 hours ago

Claude confidently told me how to fix it. I tried it

Here’s your problem. You need to let Claude try, then it can validate its attempts and iterate. If you (the human) are the feedback loop, then yes, it’s very frustrating. You need to give the agent its own feedback loop.

I have had Claude configure NixOS servers and desktops for 6+ months at this point.

I have had Claude troubleshoot and come up with a Nvidia GPU driver workaround so that I could start Wayland for the first time on my machine: https://michael.stapelberg.ch/posts/2026-01-04-wayland-sway-in-2026/

I can feed my journalctl or systemd-analyze output into Claude and have it diagnose and troubleshoot.
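A minimal sketch of that kind of pipe, assuming Claude Code's CLI is on the PATH (the prompt wording and journalctl filters here are illustrative, not the commenter's actual invocation):

```shell
# One-shot ("print") mode: -p sends a single prompt and exits.
# Feed it only this boot's error-level journal entries.
journalctl -b -p err --no-pager | claude -p "Diagnose these journal errors and suggest likely fixes"
```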

hyPiRion | 7 hours ago

To clarify, when I say "I tried it", I meant that I vetted the commands before letting Claude run them.

But regardless, I am not really sure how I would let Claude go ahead and verify that it had completed its task when it already tried to do so. Initially it tried to take a camera snapshot with gstreamer, then declared success when it managed to do so --- without inspecting the content and realising that the output was pure black. Then when I told it to inspect it, it did some more work and got the camera initialisation to work (I presume), as it now got a working snapshot... but no more frames than the initial one. After much tinkering it found a way to get the camera running about 50% of the time with gstreamer, 50% of the time at an amazing 4 fps... and also not working at all with cheese, Chrome, Firefox etc. Then I tried to get Claude to get it working with cheese, without any visible progress.

So what, exactly, should I have done here? Predicted all these error situations, and told it to take a video with Chrome/cheese, ensure that the video contains something other than black, and also ensure that the video runs at 30 fps and does so consistently? Part of the problem is that verifying that it works isn't something you can do easily with text, and even if it's possible with image and video inspection, neither I nor Claude knew all these potential error states ahead of time, before I actually started on this adventure.
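One way to put part of that verification into the agent's own loop, as a sketch: have it check a captured frame's mean brightness rather than trusting exit codes. This assumes ffmpeg and ImageMagick are installed; the device path and the brightness-threshold idea are illustrative assumptions, not something either party actually ran:

```shell
# Grab a single frame from the webcam (device path is a placeholder)
ffmpeg -y -f v4l2 -i /dev/video0 -frames:v 1 /tmp/frame.png
# Mean pixel brightness in [0,1]; an all-black frame reports ~0
magick /tmp/frame.png -format "%[fx:mean]\n" info:
```

Frame rate and per-application behaviour are much harder to script, which is the commenter's point.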

whalesalad | a day ago

Claude fixed my Apple Studio Display on Linux. I have it running through a Thunderbolt add-in card. Had to do some non-stable kernel upgrades and a few other things, like forcing the port into a specific mode to avoid artifacting (this runs on boot):

#!/bin/bash
# Give the DRM connector a moment to come up before poking debugfs
sleep 5
# Force the DP-3 connector to 4 lanes at link rate 0x1e (8.1 Gbps/lane, HBR3)
echo "4 0x1e" > /sys/kernel/debug/dri/0/DP-3/link_settings

It would have taken me ages to figure this out on my own. Now I can explore a yak shave in a few minutes instead of putting it off for yet another weekend to deep dive.
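A boot-time hook like the script above is commonly wired up as a oneshot systemd unit. This is a hypothetical sketch, not the commenter's actual setup (unit name, script path, and ordering target are all assumptions):

```
# /etc/systemd/system/fix-dp-link.service (hypothetical)
[Unit]
Description=Force DP-3 link settings for Apple Studio Display
After=graphical.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/fix-dp-link.sh

[Install]
WantedBy=graphical.target
```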

I am also using it to completely re-engineer my homelab. Everything is infrastructure-as-code (Ansible + Proxmox foundations), integrated with UniFi. FreeIPA 'domain controller' running and managing all user/host auth internally; replaced TrueNAS with a simple DIY Ubuntu+ZFS box leveraging FreeIPA for access control. I just got SMB on Mac working with Kerberos, and last night NFS with Kerberos too. It helped me benchmark different strategies to evaluate the overhead of krb5p (integrity + encryption) vs krb5i (integrity only, no encryption). Everything is centrally managed in a single git repo. Really loving the way I have been able to tackle this in tiny little incremental steps, writing daily journal logs of progress, and slowly refining this into the dream setup I have wanted for ages.
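A sketch of how such a Kerberos-flavour comparison might be run, assuming an NFSv4 export on a host named tank (the mount point, export path, and write size are placeholders):

```shell
# Mount the same export with integrity-only (krb5i), then with
# encryption (krb5p), timing a 1 GiB sequential write each time.
for sec in krb5i krb5p; do
  sudo mount -t nfs4 -o sec=$sec tank:/export /mnt/bench
  time dd if=/dev/zero of=/mnt/bench/testfile bs=1M count=1024 conv=fsync
  sudo umount /mnt/bench
done
```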

some recent commits on that front:

702dbcf (HEAD -> main, origin/main, origin/HEAD) fix: gssproxy duplicate config + journal for NFS client session
359e604 feat: NFSv4 + Kerberos server on tank, runbook + plan updates
79cbe94 Merge pull request #16 from whalesalad/copilot/setup-nfs-on-tank
4882259 (origin/copilot/setup-nfs-on-tank) Fix lucifer OS: CachyOS → Debian 12 across runbook and FreeIPA research docs
795ef05 Fix escaped pipe character in troubleshooting table
306602f Add NFS setup runbook: NFSv4 + krb5p on tank, non-domain-joined lucifer, automount
4f8ee21 Initial plan
225f21b feat: SMART monitoring + weekly scrub cron on tank
37e2edf feat: ZED email monitoring + mail relay client role
f5d5d23 feat: central SMTP relay via Postfix + AWS SES
05b9270 questionable
918c1a5 Merge pull request #15 from whalesalad/claude/ubuntu-debloat-smb-mac-HsI5X
fd7b4c6 fix: chrony service name (chronyd → chrony) and add NTP config
547b282 fix: scope base playbook to explicit ubuntu group, not all hosts
939d58d cloud doc consolidation doc
375c413 fix: add avahi for Bonjour/mDNS — required for macOS Finder discovery
3a5c873 feat: ubuntu base role (debloat), smb mac polish, retire time machine
a4935fa Add ubuntu_base + ubuntu_domain roles, fix SSSD boot resilience
b8f1157 Merge pull request #14 from whalesalad/copilot/investigate-smb-home-issue
58aee99 (origin/copilot/investigate-smb-home-issue) fix: use homes-safe token for SMB home share mapping
b2c3dd1 Initial plan
55adfac huge progress today, exhasuted

glhaynes | 23 hours ago

I'm extremely pleased with having Claude Code administer my NixOS machine, exactly because it doesn't fall into the same pits as your experience: it's just running benign commands that don't alter the system and editing text files (my Nix flake) that all live in git. I can rebuild after it makes some edits, test it out, and hop right back to where we were if they didn't work.
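The loop described here can be sketched roughly like this (hostname is a placeholder; exact flake layout varies):

```shell
# Let the agent edit the flake, then apply the config non-persistently:
sudo nixos-rebuild test --flake .#myhost    # activates now, doesn't touch the boot menu
# Happy with it? Make it the boot default:
sudo nixos-rebuild switch --flake .#myhost
# Not happy? Discard the edits and rebuild from the last good commit:
git checkout -- . && sudo nixos-rebuild test --flake .#myhost
```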

If I were running it in a "traditional" system, even if it were working well, I'd constantly feel like we were leaving behind lots of potentially-harmful cruft and spurious config changes from paths we went down that didn't work. Hell, I feel that way trying to administer a system by hand.

altano | 16 hours ago

I've had the same experience: Claude Code is a great UI over NixOS.

I can't imagine using any type of agentic model to fix a stateful machine, especially in the presence of hardware and firmware. Maybe for a script in a VM that's easy to redeploy...

Their bash-fu is bizarre and they don't check the manuals/source nearly enough for my taste without explicit instruction.

glhaynes | 19 hours ago

That's another thing I like about the combo: having my system config declarative in the Nix language (which has to be valid for a rebuild) gives some of that "agentic loop" magic and less anxiety about what a bungled line might mean for my system's state. Plus, the changes tend to be infrequent/small enough that I'm not tempted to always-accept-changes or to not read them all.

dpc_pw | 20 hours ago

LLMs appear brilliant when they figure out something the user didn't know how to do; they appear very dumb when they struggle with something the user can eventually figure out themselves. Nothing too surprising IMO.

And that's not unlike my own troubleshooting. Sometimes I'll figure out something non-trivial and feel good about myself, and sometimes I realize in hindsight that I failed due to incorrect assumptions or missing knowledge, and feel a bit dumb.

heckie | 23 hours ago

I've had good experiences using Claude Code to help me diagnose weird mDNS issues. The solution ended up being just running a local dnsmasq and abandoning mDNS.

It's a bit more work than just configuring avahi and nsswitch.conf on every host, since I'm now running another service in my homelab, but most of it boiled down to "I don't want to spin up the Docker image and do the config for one or two local hostnames". Claude did it in like 15 minutes while I was futzing with the actual project work that the broken mDNS resolution was intermittently getting in the way of.
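For flavour, a dnsmasq setup of this shape is only a few lines; every name and address below is invented:

```
# /etc/dnsmasq.d/homelab.conf (hypothetical)
# Answer authoritatively for a couple of local hosts instead of mDNS
address=/nas.lan/192.168.1.10
address=/grafana.lan/192.168.1.11
# Forward everything else upstream
server=1.1.1.1
```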

The other big win for Claude in my homelab: I was putting off fixing up and reintegrating my Grafana/Prometheus with all my homelab services, because I don't want to do monitoring work in my spare time. Gluing that all together is pretty trivial, though, just tedious.

Usually, the way I have to do this is to first root-cause, as far down as necessary, and then work back upwards.

threkk | a day ago

My overall enjoyment has gone down as agentic tools have gotten better. I can feel myself losing skills. I've started projects where I don't use LLMs at all to slow that down. But at work, the productivity gains are real, and I think using these tools is the right decision.

I agree with this. At work we have a huge range of AI tools to support us, and in many situations they are the fastest way of getting the job done, so it is hard to justify not using them. However, I also feel my research, debugging, and sometimes even coding skills deteriorating, so for my personal projects I do not use them, to keep myself sharp. In the end, life is a marathon, not a sprint.

Blintk | a day ago

I will worry about this as long as we don't have on-par open-source models we can run on our own machines. The risk of intelligence brownouts is real and worrying.

simonw | a day ago

The Qwen 3.5 models are shockingly good - the qwen3.5-35b-a3b one runs comfortably on my 64GB Mac and really does appear competitive with the best closed model of ~18 months ago.

enobayram | a day ago

Have you ever tried these models for agentic coding? And not just the local/smaller versions like 35b, but the full 397b versions etc. I'm wondering what we'll be left with when the AI vendors inevitably fully enshittify themselves.

simonw | a day ago

No, I need to try having them drive a coding agent properly. That's been a huge issue with previous local models I've tried: driving Claude Code requires them to stay useful across dozens or even hundreds of tool calls.

(I compared them favorably to a ~18-month-old model, but models of that era couldn't run agent harnesses effectively either.)

enobayram | a day ago

I see, thanks for the answer. I always feel uncomfortable with any workflow that relies on AI vendors remaining reasonable, which we all know to be a very temporary state of affairs.

gerikson | a day ago

In your opinion, what are the minimum hardware requirements for running a local model productively?

(Also, relevant to this site's interests, are all local models runnable on FLOSS operating systems?)

If you require very beefy hardware to get access to GenAI, naturally the centralized solutions will win out. Especially as computer hardware components are experiencing a significant price increase at the consumer level, in part because of increased demand from the centralized solutions.

k749gtnc9l3w | a day ago

My setup: a used Ryzen 9 mini-PC (not that fresh a generation), 64 GiB system RAM, Linux, iGPU, Vulkan, llama.cpp. With less RAM you have issues running 30-ish-B class models on the iGPU, and the 15B-ish class is visibly more constrained. With a sane RAM market the setup was not too expensive (and reasonable for other tasks too). Can the helium supply run out already and cause a market panic around datacenter investments?
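For reference, a llama.cpp invocation on a setup like that might look as follows; the model file name is a placeholder, and the Vulkan backend is selected when building llama.cpp (currently via the GGML_VULKAN CMake option):

```shell
# Offload all layers to the iGPU and serve an OpenAI-compatible API
llama-server --model qwen3-30b-a3b-q6_k.gguf -ngl 99 --ctx-size 16384 --port 8080
```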

refi64 | 23 hours ago

I'm moderately curious if you've tried one of the qwen3-coder-next quants (e.g. bartowski's or dinerburger's) yet, as a comparison.

timthelion | a day ago

If GLM5 gets open-sourced it will be great, as GLM5 is smart enough for agentic coding...

kraxen72 | a day ago

it's open-weight, no? you can go download it on hugging face. if you mean releasing the training data... i don't think that's happening. also GLM 5 is huge, i don't think you could realistically run it on consumer hardware, maybe with extreme quantization. (there's some 2-bit quant at ~241 GB that you could technically squeeze onto a max-spec mac studio/pro.)

timthelion | a day ago

Aha, I hadn't realized that it had been released yet. Sure, it can't run on consumer hardware. We need workers' collectives to collectively pay for such hardware we can share.

k749gtnc9l3w | a day ago

I used Qwen3-Coder-30B-A3B quantised to 6 bits with Qwen Code, and asked it to write me a Nix expression for some stuff. I cleared the context out rather frequently. It did OK on some parts, sometimes wanted to go off the rails, and failed to resolve a problem that now seems unsolvable to me after I tried to fix its best attempt. Of course, locally it is rather slow (but running in the background, for things I don't get around to anyway, doesn't require me to interact with the slowness! This also applies to non-agentic use, which I do more of.)

ocramz | 7 hours ago

I've been curious about async LLM ergonomics as well. What are your use cases/patterns?

k749gtnc9l3w | 6 hours ago

I mostly use direct chat / the local API, not an agent.

It is kind of fast enough for semi-sync use: I can check whether what it starts to output (or to put into its chain of thought) seems reasonable or points to a missed assumption in my question, and restart with a better question if it goes in the wrong direction.

If it looks like there is a chance of a good result, I let it be. When the fans of the miniPC settle down, and I have time, I look at the result. Works particularly well when I am also chatting in a few semi-sync chats, as well as sorting the email notifications — I am already in a maximally context switching mode anyway.

Sometimes it is something longer if I think chances are good.

I mostly ask it for painfully boring things that are more ergonomic to test once I have them, or for language translations, so the agentic part is not as useful to me. But short-ish tasks given to an agent more or less worked for fixing trivial missed-dependency issues (until hitting the part that might have been simply impossible).

kraxen72 | a day ago

fwiw, i've used perplexity pretty extensively whenever i'm troubleshooting some linux issue or configuring something. last time it was when i wanted to set up spotify-qt with librespot, or when i was setting up snapper on fedora with btrfs. obviously i try to follow some tutorial if it's a well known thing (e.g. snapper w/ btrfs), but at some point i'll inevitably deviate.

i have played around with linux since like 2018, but i have a dual gpu laptop, so i didn't make the switch from windows to linux full time until summer 2024.

i use perplexity (just through a webui), so i'm running (and vetting) all the commands myself. even if i don't fully understand all the commands, i can generally tell if i like the direction it's going, and if i don't like it, i push back, asking for feedback on the solution that i think might work.

perplexity's nice because it's built around web search, so it usually doesn't make stuff up, but nowadays even claude and chatgpt can browse the web, i'm just used to perplexity.

i think that the main advantage for me is that it saves me a lot of frustration. in the past, i would often spend ages browsing through old archived forums, trying solution after solution that didn't work, and when i finally fixed it, most of what i felt wasn't satisfaction that i had solved the problem, but relief that i could finally stop spending time on it. basically i'm not a huge fan of the act of troubleshooting - i would just rather have the system work as i expect it to :)

before perplexity, i had that one linux friend i would annoy about stuff like this, but ever since he started college and got a job, he doesn't have that much time to help me troubleshoot some linux issue for 4 hours in the middle of the night. (if you wanna be my new linux friend, lmk, i'd love to have more linux friends)

back to the topic of the post: some could argue (correctly) that my approach is not as hands-off as claude code, but i think that's a good thing. i want to learn more about linux, and i want to understand the problems, why they're happening, and what the commands that fix them do.

whenever i'm troubleshooting something/setting something up, i usually keep a markdown doc of the steps i took as i go, in a tutorial/guide style, for more than one reason:

  • it helps me stay on track, and i like keeping a "personal wiki"
  • if i have to split the troubleshooting up into multiple sessions, i can then continue with a fresh llm chat + this as a summary
  • on more than one occasion, i've revisited a doc like this to see how i set something up or what that one command was.
  • sometimes my friend who i got into linux wants to have some setup/workflow i have, so i can just send them this as a tutorial
  • i keep telling myself that one day i'll publish these on my blog (which i have not yet created)

here's an example: https://share.note.sx/qmtzutp3#OgYatCPYeyrbveRbsC5k4TLbJUYV4OOQo3fzyP69PaE

anyway, i guess my point is that llms (especially with web search) can be pretty useful when troubleshooting/configuring linux, but i don't like having claude code/an agent do everything for me.

andreicek | a day ago

For me, after switching to Linux and back repeatedly over the past 10 years, having Claude always open in my dotfiles repo has made all the difference.

I debug issues with my install and it fixes whatever is broken that day. It went from an unstable time sink to a rock-solid system I look forward to using every day.

quindarius | a day ago

However, I spent maybe 95% less time on this install than any previous one, and the result is better than anything I've built by hand. I think that's worth it.

Man not only wants to be loved, but to be lovely.

  • Adam Smith

mudkip | 13 hours ago

Something underrated about NixOS is that AI can write your entire system configuration for you. It also alleviates some of NixOS's pain, because you can package obscure software and make things declarative easily.

saghul | a day ago

A cool follow-up would be to have Claude dump that knowledge into an Ansible playbook. That way you have a deterministic way to (re)set up your computer the way you like.
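A sketch of what such a playbook might contain, with the host group, file names, and variables all invented for illustration:

```yaml
# playbook.yml (hypothetical): replay fixes an agent discovered
- hosts: workstation
  become: true
  tasks:
    - name: Install the boot-time display fix script
      ansible.builtin.copy:
        src: files/fix-dp-link.sh
        dest: /usr/local/bin/fix-dp-link.sh
        mode: "0755"
    - name: Pull newer camera packages from testing
      ansible.builtin.apt:
        name: "{{ camera_packages }}"   # list defined in inventory vars
        default_release: testing
```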