• henfredemars@infosec.pub
    9 months ago

    My quick lazy manual transcription:

    What data was used to train Sora?
    We used publicly available data and licensed data.

    So, videos on YouTube?
    I’m actually not sure about that.

    OK, videos from Facebook? Instagram?
    You know, if they were publicly available, um, yeah, publicly available to use, there might be the data, but I’m not sure. I’m not confident about it.

    What about Shutterstock? I know you guys have a deal with them.
    I’m just not gonna go into the details of the data that was used, but it was publicly available or licensed data.

    EDIT: Please help, can’t figure out how to preserve line breaks. EDIT: Improved it a bit.