I don’t know much about AI models, but that’s still more than other vendors are giving away, right? Especially "Open"AI. A lot of people just care if they can use the model for free.
How useful would the training data be? Training the largest Llama model took a cluster of over 100,000 Nvidia H100 GPUs, so I’m not sure how many people would want to repeat that.
I didn’t say I like centralized sites, though. Web 2.0 didn’t necessarily bring centralized sites; it brought user contributions and user-to-user communication. Forums and wikis were big, for example. It also popularized interoperability with things like RSS and Atom.