Or my favorite quote from the article:

“I am going to have a complete and total mental breakdown. I am going to be institutionalized. They are going to put me in a padded room and I am going to write… code on the walls with my own feces,” it said.

  • Showroom7561@lemmy.ca · 3 days ago

    I once asked Gemini for the steps to do something pretty basic in Linux (even as a novice, I could have figured it out myself). The steps it gave me were not only nonsensical, they seemed to be random steps for more than one problem all rolled into one. It was beyond useless and a waste of time.

    • prole@lemmy.blahaj.zone · 2 days ago

      This is the conclusion that anyone with any bit of expertise in a field has come to after 5 mins talking to an LLM about said field.

      The more this broken shit gets embedded into our lives, the more everything is going to break down.

      • jj4211@lemmy.world · 2 days ago

        “after 5 mins talking to an LLM about said field.”

        The insidious thing is that LLMs tend to be pretty good at 5-minute first impressions. I’ve repeatedly seen people set out to evaluate an LLM, and they generally fall back to: “ok, if this were a human, I’d ask a few job interview questions, well known enough that they have a shot at answering, but tricky enough to show they actually know the field.”

        As an example, a colleague became a true believer after management directed him to evaluate it. He asked it to “generate a utility to take in a series of numbers from a file and sort them and report the min, max, mean, median, mode, and standard deviation”, and it did so instantly, with “only one mistake”. Then he tried the exact same question later in the day, it happened not to make that mistake, and he concluded it must have ‘learned’ how to do it in the intervening couple of hours. Of course that’s not how it works; there’s just a bit of probabilistic sampling involved, and any perturbation of the prompt can produce unexpected variation. But he doesn’t know that…
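        For reference, a minimal sketch of the kind of utility that prompt describes (Python assumed; the file layout and output format are my own choices) takes only a handful of standard-library calls:

        ```python
        # Read one number per line from a file, sort them, and report
        # min, max, mean, median, mode, and standard deviation.
        import statistics
        import sys

        def summarize(path):
            with open(path) as f:
                numbers = sorted(float(line) for line in f if line.strip())
            return {
                "min": numbers[0],
                "max": numbers[-1],
                "mean": statistics.mean(numbers),
                "median": statistics.median(numbers),
                "mode": statistics.mode(numbers),
                "stdev": statistics.stdev(numbers),
            }

        if __name__ == "__main__":
            for name, value in summarize(sys.argv[1]).items():
                print(f"{name}: {value}")
        ```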

        Note that management frequently never gets beyond tutorial/interview-question fodder when it comes to the technical side of their teams’ work, so you get to see how they might tank their companies because the LLMs “interview well”.