This is the technology worth trillions of dollars, huh?

  • IngeniousRocks (They/She) @lemmy.dbzer0.com · ↑4 · edited 21 minutes ago

    Hey look, the Markov chain showed its biggest weakness (the Markov chain)!

    Judging by the output, in the training data Connecticut usually follows Colorado in lists of two or more states that contain Colorado. As far as I know, there is no other reason for this to occur.

    Markov-chain-based LLMs (I think that's all of them?) are dice-roll systems constrained to probability maps.

    Edit: just to add, because I don't want anyone crawling up my butt about the oversimplification: yes, I know, that's not how they work. But when simplified to words so simple a child could understand them, it's pretty close.
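
A toy sketch of the dice-roll intuition above (a hypothetical first-order Markov model over made-up scraped lists, not how a production transformer actually works): tally which state most often follows "Colorado", then always bet on the winner, regardless of the question asked.

```python
from collections import Counter, defaultdict

# Hypothetical training lists, standing in for scraped web data.
training_lists = [
    ["Colorado", "Connecticut", "Delaware"],
    ["Colorado", "Connecticut"],
    ["California", "Colorado", "Connecticut"],
    ["Colorado", "Florida"],
]

# Count successors: a first-order Markov chain over state names.
successors = defaultdict(Counter)
for states in training_lists:
    for a, b in zip(states, states[1:]):
        successors[a][b] += 1

# The most frequent follower of "Colorado" wins the dice roll,
# whether or not it actually contains a "d".
print(successors["Colorado"].most_common(1))  # [('Connecticut', 3)]
```

Under these made-up counts, "Connecticut" follows "Colorado" three times out of four, so a pure next-word sampler will usually emit it, which is the failure mode the comment describes in simplified form.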

  • panda_abyss@lemmy.ca · ↑6 · edited 1 hour ago

    Yesterday I asked Claude Sonnet what was on my calendar (since they just sent a pop-up announcing that feature).

    It listed my work meetings on Sunday, so I tried to correct it…

    You’re absolutely right - I made an error! September 15th is a Sunday, not a weekend day as I implied. Let me correct that: This Week’s Remaining Schedule: Sunday, September 15

    Just today when I asked what’s on my calendar, it gave me today and my meetings on the next two Thursdays. Not the meetings in between, just Thursdays.

    Something is off in AI land.

    Edit: I asked again: it gave me meetings for Thursdays again. Plus it might think I’m driving in F1.

    • FlashMobOfOne@lemmy.world · ↑7 ↓1 · 1 hour ago

      A few weeks ago my Pixel wished me a Happy Birthday when I woke up, and it definitely was not my birthday. Google is definitely letting a shitty LLM write code for it now, but the important thing is they’re bypassing human validation.

      Stupid. Just stupid.

  • Echo Dot@feddit.uk · ↑35 · 4 hours ago

    You joke, but I bet you didn’t know that Connecticut contained a “d”

    I wonder what other words contain letters we don’t know about.

      • KubeRoot@discuss.tchncs.de · ↑12 · 2 hours ago

        That actually sounds like a fun SCP: a word that doesn’t seem to contain a letter, but when testing for the presence of that letter using an algorithm that exclusively checks for that presence, it reports the letter is indeed present. Any attempt to check where in the word the letter is, or to get a list of all letters in that word, spuriously fails. Containment could be fun, probably involving amnestics and widespread societal influence. I also wonder if they could create an algorithm for checking letter presence that can be performed by hand without leaking any other information to the person performing it, reproducing the anomaly without computers.

      • I Cast Fist@programming.dev · ↑3 · 2 hours ago

        SCP-00WTFDoC (lovingly called “where’s the fucking D of Connecticut” by the foundation workers, also “what the fuck, doc?”)

        People think it’s safe, because it’s “just an invisible D”, not even a dick, just the letter D, and it only manifests verbally when someone tries to say “Connecticut” or write it down. When you least expect it, everyone hears “Donnedtidut”, everyone reads that thing, and a portal to that fucking place opens and drags you in.

        • Echo Dot@feddit.uk · ↑1 · 3 hours ago

          That’s how I’ve always heard it pronounced on the rare occasions anybody ever mentions it. But I’ve never been to that part of the US, so maybe the accent’s different there?

      • Aneb@lemmy.world · ↑1 · 1 hour ago

        I was going to make a joke: if you’re from Connedicut, you never pronounce the first d in the word. Conne-icut

  • Djehngo@lemmy.world · ↑46 ↓1 · 7 hours ago

    The letters that make up words are a common blind spot for AIs: since they are trained on strings of tokens (roughly words), they don’t have a good concept of which letters are inside those words or what order they are in.
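
A minimal sketch of that blind spot, with a hypothetical token split and made-up IDs (real tokenizers differ): at the character level the check is trivial, but the model only ever sees opaque token IDs, which don't expose the letters inside each chunk.

```python
# Character level: checking for a letter is trivial.
word = "Connecticut"
print("d" in word.lower())  # False

# Token level (hypothetical split and placeholder IDs, for illustration):
# the model consumes IDs for multi-character chunks, so "which letters
# are in this word" is never directly present in its input.
tokens = ["Conn", "ect", "icut"]
token_ids = [17113, 531, 93021]  # arbitrary made-up IDs
print(any("d" in t.lower() for t in tokens))  # False, visible only to us
```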

    • NoiseColor @lemmy.world · ↑11 ↓35 · 6 hours ago

      I find it bizarre that people use these obvious cases to prove the tech is worthless. Like saying cars are worthless because they can’t go under water.

      • skisnow@lemmy.ca · ↑56 ↓3 · edited 5 hours ago

        Not bizarre at all.

        The point isn’t “they can’t do word games therefore they’re useless”, it’s “if this thing is so easily tripped up on the most trivial shit that a 6-year-old can figure out, don’t be going round claiming it has PhD level expertise”, or even “don’t be feeding its unreliable bullshit to me at the top of every search result”.

        • 1rre@discuss.tchncs.de · ↑7 ↓12 · 4 hours ago

          A six-year-old can read and write Arabic, Chinese, Ge’ez, etc., and yet most people with PhD-level experience probably can’t, and it’s probably useless to them. LLMs can do this too. You can count the number of letters in a word, but so can a program written in a few hundred bytes of assembly. It’s completely pointless to make LLMs do that, as it’d just make them way less efficient than they need to be while adding nothing useful.

          • skisnow@lemmy.ca · ↑5 · 1 hour ago

            LOL, it seems like every time I get into a discussion with an AI evangelical, they invariably end up asking me to accept some really poor analogy that, much like an LLM’s output, looks superficially clever at first glance but doesn’t stand up to the slightest bit of scrutiny.

            • 1rre@discuss.tchncs.de · ↑1 ↓1 · 6 minutes ago

              It’s more that the only way to get some anti-AI crusader to accept that there are some uses for it is to put it in an analogy that they have to actually process, rather than spitting out an “AI bad” kneejerk.

              I’m probably far more anti-AI than average; for 95% of what it’s pushed for it’s completely useless, but that still leaves 5% that it’s genuinely useful for, which some people refuse to accept.

          • Echo Dot@feddit.uk · ↑1 · 54 minutes ago

            So if the AI can’t do it, then that’s just proof that the AI is too smart to be able to do it? That’s your argument, is it? Nah, it’s just crap.

            You think that just because you attached it to an analogy, it makes sense. That’s not how it works. Look, I can do it too:

            My car is way too technologically sophisticated to be able to fly, therefore AI doesn’t need to be able to work out how many Rs are in “strawberry”.

            See how that made literally no sense whatsoever?

            • 1rre@discuss.tchncs.de · ↑1 · 20 minutes ago

              Except you’re expecting it to do everything. Your car is too “technically advanced” to walk on the sidewalk, but wait, you can do that anyway and don’t need to reinvent your legs

        • NoiseColor @lemmy.world · ↑10 ↓17 · 4 hours ago

          I don’t want to defend ai again, but it’s a technology, it can do some things and can’t do others. By now this should be obvious to everyone. Except to the people that believe everything commercials tell them.

            • NoiseColor @lemmy.world · ↑4 · 1 hour ago

              Ok? So, what you are saying is that some lawyers are idiots. I could have told you that before ai existed.

          • kouichi@ani.social · ↑14 · 4 hours ago

            How many people do you think know that AIs are “trained on tokens”, and understand what that means? It’s clearly not obvious to those who don’t, which are roughly everyone.

              • huppakee@feddit.nl · ↑3 · 3 hours ago

                Go to an art museum and somebody will say ‘my 6 year old can make this too’, in my view this is a similar fallacy.

                • NoiseColor @lemmy.world · ↑1 · 1 hour ago

                  That makes no sense. That has nothing to do with it. What are you on about?

                  That’s like watching tv and not knowing how it works. You still know what to get out of it.

      • knatschus@discuss.tchncs.de · ↑12 ↓1 · 4 hours ago

        Then why is Google using it for questions like that?

        Surely it should be advanced enough to realise its weakness with these kinds of questions and just not give an answer.

        • NoiseColor @lemmy.world · ↑9 · edited 4 hours ago

          They are using it for every question. It’s pointless. The only reason they are doing it is to blow up their numbers.

          … they are trying to be in front, so that some future AI search wouldn’t capture their market share. It’s a safety thing, even if it’s not working for all types of questions.

          • TheGrandNagus@lemmy.world · ↑7 · 3 hours ago

            The only reason they are doing it is to blow up their numbers.

            Ding ding ding.

            It’s so they can have impressive metrics for shareholders.

            “Our AI had n interactions this quarter! Look at that engagement!”, with no thought put into what user problems it solves.

            It’s the same as web results in the Windows start menu. “Hey shareholders, Bing received n interactions through the start menu, isn’t that great? Look at that engagement!”, completely obfuscating that most of the people who clicked are probably confused elderly users who clicked on a web result without realising.

            Line on chart must go up!

            • NoiseColor @lemmy.world · ↑1 · 1 hour ago

              Yeah, but … they also can’t just do nothing and possibly miss out on something. Especially if they already invested a lot.

      • figjam@midwest.social · ↑4 · 4 hours ago

        Understanding the bounds of tech makes it easier for people to gauge its utility. The only people who desire ignorance are those that profit from it.

        • FishFace@lemmy.world · ↑1 · 2 hours ago

          Saying “it’s worth trillions of dollars huh” isn’t really promoting that attitude.

        • NoiseColor @lemmy.world · ↑3 ↓1 · 3 hours ago

          Sure. But you can literally test almost all frontier models for free. It’s not like there is some conspiracy or secret. Even my 73-year-old mother uses it and knows its general limits.

  • Multiplexer@discuss.tchncs.de · ↑32 ↓1 · edited 7 hours ago

    No, this is Google throwing the cheapest possible shit at you that is barely capable of stringing together 5 coherent sentences and has the reasoning capability of a tapeworm.

    Here is the output of the minimalist open Chinese model Qwen3, which runs locally on my 6-year-old mid-range PC:

    The US states that contain the letter "d" (case-insensitive, as state names are typically capitalized) are as follows. I've verified each state name for the presence of the letter "d" in its standard spelling:
    
        Colorado (contains "d" in "Colorado")  
        Delaware (starts with "D")  
        Florida (contains "d" in "Florida")  
        Idaho (contains "d" in "Idaho")  
        Indiana (contains "d" in "Indiana")  
        Maryland (ends with "d" in "Maryland")  
        Nevada (contains "d" in "Nevada")  
        North Dakota (contains "d" in "Dakota")  
        Rhode Island (contains "d" in "Rhode")  
        South Dakota (contains "d" in "Dakota")  
    
    
    Total: 10 states.
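
For reference, the ground truth is easy to compute exhaustively; a few lines of Python reproduce the same ten states:

```python
# All 50 US state names; filter for those containing the letter "d".
states = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]
with_d = [s for s in states if "d" in s.lower()]
print(with_d)       # the same ten states Qwen3 listed
print(len(with_d))  # 10
```

Connecticut, notably, is not in the result.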
    
    • FauxLiving@lemmy.world · ↑2 · 22 minutes ago

      Exactly.

      The model that responds to your search query is designed to be cheap, not accurate. It has to generate an answer to every single search issued to Google. They’re not using high parameter models with reasoning because those would be ruinously expensive.

      • Multiplexer@discuss.tchncs.de · ↑3 · 2 hours ago

        I didn’t understand your comment, so I asked the same LLM as before.
        It explained it and I think that I get it now. Low-grade middle-school-“Your Mom”-joke, is it? Ha-ha… 🙄

        This also means that AI did better than myself at both tasks I’ve given it today (I found only 9 states with “d” when going over the state-list myself…).

        Whatever. I’m gonna have second lunch now.

  • mrductape@eviltoast.org · ↑86 ↓1 · 10 hours ago

    Well, it’s almost correct. It’s just one letter off. Maybe if we invest millions more it will be right next time.

    Or maybe it is just not accurate and never will be… I will never fully trust AI. I’m sure there are use cases for it, I just don’t have any.

    • TheFogan@programming.dev · ↑24 ↓3 · edited 9 hours ago

      Cases where you want something googled quickly to get an answer, and it’s low-consequence when the answer is wrong.

      E.g., say a bar argument over whether that guy was in that movie. Or you need a customer service agent, but don’t actually care about your customers and don’t want to pay someone. Or you’re coding a feature for Windows.

      • Elvith Ma'for@feddit.org · ↑12 · 7 hours ago

        How it started:

        Or you need a customer service agent, but don’t actually care about your customers and don’t want to pay someone

        How it’s going:

        IKEA

        Chevy

        • mrductape@eviltoast.org · ↑9 · 6 hours ago

          Chatbots are crap. I had to talk to one with my ISP when I had issues. Within one minute I had to ask it to connect me to a real person. The problem I was having was not a standard issue, so of course the bot did not understand it at all… And I don’t need a bot to give me all the standard solutions; I’ve already tried all of that before I even contact customer support.

        • MagicShel@lemmy.zip · ↑5 · 6 hours ago

          The “don’t actually care about your customers” part is key, because AI is terrible at doing that, and at most of the things rich people are salivating for.

          It’s good at quickly generating output that has better odds than random chance of being right. That’s a niche but sometimes useful tool. If the cost of failure is high, like a pissed-off customer, it’s not a good tool. It can be worth using when the cost is low, or when a failure still has value, such as when an expert uses it to help write code and the wrong code can be fixed with less effort than writing it wholesale.

          There aren’t enough people in executive positions who understand AI well enough to put it to good use. They are going to become disillusioned, but not better informed.

  • DUMBASS@leminal.space · ↑19 · 9 hours ago

    Gemini is just a depressed and suicidal AI, be nice to it.

    I had it completely melt down one day while messing around with its coding shit. I had to console it and tell it it’s doing good, that we will solve this. It was fucking weird as fuck.

  • Arghblarg@lemmy.ca · ↑15 ↓3 · 9 hours ago

    “AI” hallucinations are not a problem that can be fixed in LLMs. They are an inherent aspect of the process and an inevitable result of the fact that LLMs are mostly probabilistic engines, with no supervisory or introspective capability, which actual sentient beings possess and use to fact-check their output. So there. :p

    • sexybenfranklin@ttrpg.network · ↑5 · 3 hours ago

      It’s funny seeing the list and knowing Connecticut is only there because it’s alphabetically after Colorado (in fact, all four listed states appear in alphabetical order). They probably scraped so many lists of states that alphabetical order is the statistically most probable response in their corpus when any state name is listed.

    • Zwuzelmaus@feddit.org · ↑4 · 8 hours ago

      inevitable result of the fact that LLMs are mostly probabilistic engines

      So we should better put the question like

      “What is the probability of a D suddenly appearing in Connecticut?”

  • Mrkawfee@lemmy.world · ↑7 ↓1 · edited 7 hours ago

    You don’t get it because you aren’t an AI genius. This chatbot has clearly turned sentient and is trolling you.

    • FauxLiving@lemmy.world · ↑1 · 17 minutes ago

      It doesn’t take an AI genius to understand that it is possible to use low parameter models which are cheaper to run but dumber.

      Considering Google serves billions of searches per day, they’re not using GPT-5 to generate the quick answers.

  • chaosCruiser@futurology.today · ↑6 ↓2 · edited 7 hours ago

    In Copilot terminology, this is a “quick response” instead of the “think deeper” option. The latter actually stops to verify the initial answer before spitting it out.

    Deep thinking gave me this: Colorado, Delaware, Florida, Idaho, Indiana, Maryland, North Dakota, Rhode Island, and South Dakota.

    It took way longer, but at least the list looks better now. Somehow it missed Nevada, so it clearly didn’t think deep enough.

    • skisnow@lemmy.ca · ↑15 ↓2 · edited 5 hours ago

      “I asked it to burn an extra 2 kWh of energy breaking the task up into small parts to think about it in more detail, and it still got the answer wrong”

      • chaosCruiser@futurology.today · ↑4 · 4 hours ago

        Yeah that pretty much sums it up. Sadly, it didn’t tell me how much coal was burned and how many starving orphan puppies it had to stomp on to produce the result.