• 0 Posts
  • 32 Comments
Joined 2 years ago
Cake day: August 12th, 2023

  • I am generally a sceptic myself, especially in my own area, which is software development. But recently in a board game community, someone was scolded for asking ChatGPT about a rule dispute (and it was wrong). All the upvotes went to unhelpful “AI bad” comments. I pointed out that while this was true 3 months ago, ChatGPT 5 (and only that one) can answer such questions very accurately when asked the right way. I showed how to phrase the user's question and the (now correct) response, and mentioned my 35 board game test questions and the results with the major flagship LLMs. (Almost all LLMs did horribly, under 70% even on yes/no questions, but ChatGPT 5 with specific instructions or the “Thinking” model got 100%.)

    Even as a sceptic, I can acknowledge that LLMs just jumped from completely useless to perfect in the past few months when it comes to this specific niche.



  • I find it also saves a certain “mental energy”.

    For example, when I worked on a program to recover data from the old, discontinued Windows photo app: I started 2 years ago and quickly had a proof of concept. I found out it's just SQLite format, checked out the table structure, and made a query to list the files from one album. So at that point, it was clear that it was doable, but the remaining 90% would be boring.

    So after 2 years on pause, I just gave Gemini 2.5 Pro the general problem and the two queries I had. It one-shot a working PowerShell script, no changes required. It reads directly from the SQLite file (imagine the annoyance of researching that when you never ever use PowerShell!) and puts the files into folders named after the former albums. My own solution would have been worse: I would probably have hacked together some copy commands from a SELECT and run them all at once.

    That was pretty nice: I got to do the interesting part of building the SQL queries, and it did the boring, tiring things for me.
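
    Roughly, the idea can be sketched like this in Python. This is not the PowerShell script Gemini produced, and the table and column names (`album`, `file`, `album_file`) are made up for illustration; the real photo-app database looks different. Album names are used as folder names as-is, which would need sanitising for real data.

```python
import shutil
import sqlite3
from pathlib import Path

# Hypothetical schema: the real photo-app database uses different names.
ALBUM_FILES_QUERY = """
SELECT a.name AS album, f.path AS path
FROM album AS a
JOIN album_file AS af ON af.album_id = a.id
JOIN file AS f ON f.id = af.file_id
ORDER BY a.name, f.path
"""


def plan_copies(conn, out_dir):
    """Return (source, destination) pairs, one target folder per album."""
    out_dir = Path(out_dir)
    pairs = []
    for album, path in conn.execute(ALBUM_FILES_QUERY):
        src = Path(path)
        pairs.append((src, out_dir / album / src.name))
    return pairs


def export_albums(conn, out_dir):
    """Copy every file that still exists on disk into its album folder."""
    copied = 0
    for src, dest in plan_copies(conn, out_dir):
        if not src.is_file():
            continue  # the database may list files that are gone
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied += 1
    return copied
```

    Usage would be something like `export_albums(sqlite3.connect("photos.sqlite"), "recovered")`, where both names are placeholders.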

    Overall, I remain sceptical as you do. There is definitely a massive bullshit-bubble, and it’s not clear yet where it ends. I keep it out of production code for now, but will keep experimenting on the side with an “it’s just code completion” approach, which I think might be viable.


  • Currently, I write all production code at work without any AI assistance. But to keep up with things, I do my own projects.

    Main observation: When I use it (Claude Code + IDE assistant) like a fancy code completion, it can save a lot of time. But it must be in my own area of expertise, so that I could do it myself just as well, only slower. It makes a mistake about 10-20% of the time, and most of them are not obvious ones like compile errors, so left unchecked it would turn the project into a disaster over time. Still, it seems like a senior developer could be about 50-100% more productive in the heat of the implementation phase. The most important job is to say “STOP” when it's about to do nonsense. The resulting code is pretty much exactly how I would have done it, and it saved time.

    I also tried “vibe coding” by using languages and technologies that I have no experience with. It resulted in seemingly working programs, e.g. to extract and sort photos from an outdated data file format, or to produce some nice statistics from 1000 lines of annual private bank statements. The latter in particular resulted in 500 lines of unmaintainable Python spaghetti code. Still nice for my private use, but nobody in the world can guarantee that there aren't pennies missing, or income and expenses switched in the calculation. So it's unusable for a company's accounting or anything like that.
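
    One cheap sanity check for that kind of script, as a minimal sketch: compare opening balance plus all transactions against the stated closing balance, using `Decimal` instead of floats so no pennies get lost to rounding. The function name and the input format (amounts as strings, negative for money going out) are assumptions for illustration, not what the generated code did.

```python
from decimal import Decimal


def reconcile(opening, closing, amounts):
    """Return computed closing balance minus stated closing balance.

    Zero means the transactions add up. Amounts are strings such as
    "-30.00" (money out) or "10.05" (money in).
    """
    total = sum((Decimal(a) for a in amounts), Decimal("0"))
    return Decimal(opening) + total - Decimal(closing)
```

    A non-zero result does not say where the error is, but it catches both missing pennies and swapped signs.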

    I think it will remain code completion for the next 5 years. The bubble of trying more than next-gen code completion for seniors will burst. What happens then is hard to say, but it takes significant breakthroughs to replace a senior and work independently.


  • Yes, I had to delete lemmy, reddit, twitter, mastodon, all games etc.

    But I see 0 harm in:

    • 2FA authenticator apps (google authenticator, app for government ID, bank, …)
    • DHL (unlocks packing station / parcel distributing machine here)
    • calendar (with voice assistant)
    • Pixel, iPhone, Samsung and some others are a fantastic camera! 10 years ago, it’d be a great deal just for that one feature. I used to pay USD/EUR 250 - 500 for a hobby-level camera that was worse
    • read my mobile CO2 sensor
    • not crucial, but occasionally show someone something in a video call
    • send injured animal photo / video right to the wildlife rescue station for advice (~ 2x per year)
    • plain old mp3 player
    • some might read eBooks, which is a good use of it, but I still prefer a hardcopy

    So yes, on my 2nd smartphone only (first in 2021), but I find that it’s worth it these days.

    Enshittification intensifies, but a Linux phone might become very viable in a few years, especially when LLM adapters become easier to use. Self-hosted alternatives to google/apple photos are already very advanced.





  • This is insane. We’ll end up with this white-washed, boring mediocrity.

    Same with that MAGA-actress from The Mandalorian. What she said wasn’t even as radical, just bought into the whole election fraud conspiracy, iirc. Now we have a formerly good show where every new character AND their actor are basically PC-principal from South Park. The left failed to support her there because she was a bit of an ass, supporting a bigger ass, but no laws were broken and she fit the role perfectly: Not that smart, badass, hardly an ally when you have no other choice.

    Enlightened centrists are left to create our art and literature. Well, at least the 1800s and 1900s produced a lifetime supply. But a contemporary reflection of our society, of our politics, like “1984”, like “Der Untertan”, like “Im Westen nichts Neues”, that used to be a huge part of it, and it’s slipping away.

    Fiction is an actor we should not pull out of public discourse. It has been crucial to it, probably long before the first written word. Unless a crime is committed, let them write!









  • Interesting what he wrote about LLMs’ inability to “zoom out” and see the whole picture. I use Gemini and ChatGPT sometimes to help debug admin / DevOps problems. It’s a great help for extra input, a bit like rubberducking on steroids.

    Examples how it went:

    Problem: Apache cluster with a connected KeyCloak cluster, odd problems with the login flow. Reducing KeyCloak to 1 node solves it, so the LLM says that we need to debug node communication and explains how to set the debug log settings. A lot of analysis together. But after a while, it's pretty obvious that the Apache cluster doesn't use the sticky session correctly and forwards requests to the wrong KeyCloak node in the middle of the login flow. The LLM does not see that and wants to keep digging deeper and deeper into supposedly “odd” details of the communication between KeyCloak nodes, although the combined logs of all nodes show that the error is in the load balancing.
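
    The failure mode can be shown with a toy model in Python. It assumes the pending login state lives only on the node that started the flow, which is a simplification and not how the real KeyCloak cluster is set up; `Node`, `round_robin` and `sticky` are illustrative names.

```python
from itertools import cycle


class Node:
    """A backend node that remembers which logins it has started."""

    def __init__(self, name):
        self.name = name
        self.pending_logins = set()

    def start_login(self, session_id):
        self.pending_logins.add(session_id)

    def finish_login(self, session_id):
        # Only succeeds on the node that saw the first request.
        return session_id in self.pending_logins


def round_robin(nodes):
    """Router that ignores the session and just alternates nodes."""
    nodes_cycle = cycle(nodes)
    return lambda session_id: next(nodes_cycle)


def sticky(nodes):
    """Router that pins each session to the node it first landed on."""
    routes = {}
    nodes_cycle = cycle(nodes)

    def pick(session_id):
        if session_id not in routes:
            routes[session_id] = next(nodes_cycle)
        return routes[session_id]

    return pick


def run_login_flow(pick_node, session_id):
    """Two-request login: start on one node, finish on whatever node is picked."""
    pick_node(session_id).start_login(session_id)
    return pick_node(session_id).finish_login(session_id)
```

    With two nodes, `run_login_flow(round_robin(nodes), "s1")` fails because the second request lands on the other node, while the `sticky` router completes the flow. With a single node both routers work, which matches the observation that reducing to 1 node hid the problem.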

    Problem: Apache from a different cluster often returns 413 (payload too large). It does happen with pretty large requests; the limit where it happens is a bit over 8 kB, not counting the body. But the incoming request is valid. So I asked both Gemini and ChatGPT for a complete list of things that cause Apache to do that. They did a decent job at that, and one item was close: check for mod_proxy_ajp use, since the observed limit could be caused by trying to build an AJP packet to communicate with the backend servers. That was not the cause; the actual module was mod_jk, which also uses AJP. The hint helped me watch out for anything using AJP when reviewing the whole config manually, so I found it, and the “rubberducking” helped indirectly. But the LLM had said we must forget about AJP and focus on other possible causes, which was a dead end. When I told it the solution, it was like: Of course, mod_jk. (413 sounds like the request TO Apache is wrong, but actually Apache tries internally to create an invalid AJP packet of over 8 kB and, when that fails, blames the incoming request.)
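
    A rough way to spot such requests up front, sketched in Python: add up the request line and headers and compare against 8192 bytes, the default maximum AJP packet size in mod_jk (`max_packet_size`). This is only an approximation, since AJP encodes headers differently from plain HTTP text, and the function names are made up for illustration.

```python
AJP_DEFAULT_PACKET_SIZE = 8192  # default maximum AJP13 packet size in bytes


def request_head_size(method, target, headers):
    """Bytes of the request line plus all header lines, as HTTP/1.1 text."""
    request_line = f"{method} {target} HTTP/1.1\r\n"
    header_lines = "".join(
        f"{name}: {value}\r\n" for name, value in headers.items()
    )
    return len((request_line + header_lines + "\r\n").encode("utf-8"))


def likely_exceeds_ajp_packet(method, target, headers,
                              limit=AJP_DEFAULT_PACKET_SIZE):
    """True if the request head alone is already larger than one AJP packet."""
    return request_head_size(method, target, headers) > limit
```

    A request with a huge cookie or a long query string can trip this even with an empty body, which fits the “a bit over 8 kB without the body” observation.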