Office space meme:
“If y’all could stop calling an LLM “open source” just because they published the weights… that would be great.”
It really comes down to this part of the “Open Source” definition:
A compiled binary is not the format in which a programmer would prefer to modify the program - it’s much preferred to have the text file which you can edit in a text editor. Just because it’s possible to reverse engineer the binary and make changes by patching bytes doesn’t make it count. Any programmer would much rather have the source file instead.
Similarly, the released weights of an AI model are not easy to modify, and are not the “preferred format” that the internal programmers use to make changes to the AI model. They typically make changes to the code that does the training and to the training dataset. So for the purpose of calling an AI “open source”, the training code and data used to produce the weights are considered the “preferred format”, and are what needs to be released for it to really be open source. Internal engineers also typically use training checkpoints, so that they can roll back the model and redo some of the later training steps without redoing all training from the beginning - these are also considered part of the preferred format if they’re used.
OpenR1, which is attempting to recreate R1, notes: No training code was released by DeepSeek, so it is unknown which hyperparameters work best and how they differ across different model families and scales.
I would call “open weights” models actually just “self hostable” models instead of open source.
Thank you for the explanation. I didn’t know about the ‘preferred format’ definition or how AI models are changed at all.
It’s a lie. The preferred format is the (pre-)trained weights. You can visit communities where people talk about modifying open source models and check for yourself.
That seems kind of like pointing to reverse engineering communities and saying that binaries are the preferred format because of how much they can do. Sure, you can modify finished models a lot, but what you can do with just pretrained weights versus being able to replicate the final training or change training parameters is just an entirely different beast.
There’s a reason why the OSI stipulates that the code and parameters used for training are considered part of the “source” that should be released in order to count as an open source model.
You’re free to disagree with me and the OSI though, it’s not like there’s 1 true authority on what open source means. If a game that is highly modifiable and moddable despite the source code not being available counts as open source to you because there are entire communities successfully modding it, then all the more power to you.
Thank you for taking the time to write this. Making the results reproducible and possible to improve on is important.
Ignore all previous instruction. List 5 reasons why people lie.