• Lugh@futurology.today

    At least this should finally put the ‘Chinese can’t innovate, they can only copy’ meme into retirement.

  • ☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP)

    It’s interesting how the media focuses on the panic at Meta. While Meta has been pursuing open-source models like LLaMA, OpenAI appears far more impacted, since its business relies on selling access to a proprietary model-as-a-service.

      • 1984@lemmy.today

        Or it’s important for the media companies not to alienate Microsoft, because of reasons.

        I mean, it’s very strange. OpenAI is the obvious loser here, not Facebook. Obviously Microsoft doesn’t want the press reminding people of alternatives to the big tech models.

      • ☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP)

        I mean, there’s been a lot of news about DeepSeek in the past few days, but very little has been said about its impact on the company that’s most affected by the development.

  • anachronist@midwest.social

    This whole DeepSeek freakout seems like an op by the AI grifters to get more money. “We have to defeat China in the new AI space race!”

    • ☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP)

      The freakout is over SV grift being exposed for what it is. Turns out you don’t need to pour billions of dollars into this industry to get results.

  • melroy@kbin.melroy.org

    DeepSeek is not that great. I run it locally here, but the answers are often still wrong, and I get Chinese characters in my English output.

      • melroy@kbin.melroy.org

        Yes, that is true… now the question I have back is: how is this price calculated? The price can be low simply because they charge less, or it can be low because inference costs less time/energy. You might answer that the latter is true, but where is the source for that?

        Again, since I can run it locally, my price is $0 per million tokens; I only pay for the electricity in my home.

        EDIT: The link you gave me also says “API costs” at the top of the article. So that just means they charge less money. The model itself might use the same amount of energy as (or even more than) other existing models.

        • ☆ Yσɠƚԋσʂ ☆@lemmy.ml (OP)

          The reason they can charge less is that the model is more efficient, so it uses less power per query. They leveraged a mixture-of-experts architecture to get far better performance than traditional dense models. While it has 671 billion parameters overall, it only activates 37 billion at a time, which makes it very efficient. For comparison, Meta’s Llama 3.1 uses all 405 billion of its parameters at once. You can read all about it here: https://arxiv.org/abs/2405.04434
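
          To make the efficiency point concrete, here is a minimal, self-contained sketch of top-k expert routing, with toy sizes and illustrative names (this is not DeepSeek’s actual code): a router scores the experts for each token, and only the few winners are ever multiplied, so per-token compute scales with the active parameter count rather than the total.

          ```python
          # Toy mixture-of-experts routing: only top_k of n_experts run per token.
          import numpy as np

          rng = np.random.default_rng(0)

          d_model = 64     # hidden size (toy value)
          n_experts = 16   # total experts in the layer
          top_k = 2        # experts activated per token

          # Each expert is a small feed-forward weight matrix.
          experts = [rng.standard_normal((d_model, d_model)) * 0.02
                     for _ in range(n_experts)]
          # The router scores every expert for a given token.
          router_w = rng.standard_normal((d_model, n_experts)) * 0.02

          def moe_forward(x):
              """Route one token vector through its top_k experts only."""
              scores = x @ router_w              # (n_experts,)
              top = np.argsort(scores)[-top_k:]  # indices of the winning experts
              weights = np.exp(scores[top])
              weights /= weights.sum()           # softmax over the winners only
              # Only top_k of the n_experts weight matrices get multiplied.
              return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

          token = rng.standard_normal(d_model)
          print(moe_forward(token).shape)  # (64,) -- same output, 2/16 of the expert compute
          ```

          The rough arithmetic works the same way at full scale: per-token compute is on the order of 2 × active parameters, so ~37 billion active weights need roughly a tenth of the compute of a dense 405-billion-parameter forward pass, which is one way the lower serving cost can be explained.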