• linearchaos@lemmy.world
    4 months ago

    The coolest and most frightening thing about all of this is that the number of books these models are trained on is immense, yet the model data is tiny by comparison. And while the compression is extremely lossy, a remarkable amount of the data is still in there.

    To Nvidia’s credit, the trained models do not contain the contents of the books, but they can still tell you intimate details about the books without being able to provide a photographic reproduction of everything in them.

    We’ve literally created something that can analyze books in the same way that we read them and retain the same lossy levels of information. That’s honestly pretty f****** amazing.

    Obviously, intellectual property laws aren’t designed for this. Hell, even our concept of intellectual property isn’t designed for this. If this were a corporation that hired a thousand people to read a bunch of books and be on tap for queries about the information in those books, nobody would complain. One purchased copy of each book would be enough to cover the intellectual property restrictions.

    Also, obviously, this isn’t what happened, and people see money lying on the table.