Vintage LLM
Talkie-1930: a large language model trained on English-language texts from before 1931, released under the Apache 2.0 license.
We chose the end of 1930 as the cutoff date because works published through that year are in the public domain in the United States. For this version of the model, we also limited ourselves primarily to English-language texts, because validating the data pipeline requires deep familiarity with the source documents, and we are native English speakers. Multilingual corpus expansion is nonetheless a high priority, to increase both the size of the corpus and the diversity of perspectives it represents.

Oh, there you are. So glad.