"Trained on" is doing a lot of work in many services. Foundation models are "trai...

"Trained on" is doing a lot of work in many services.

Foundation models are "trained" on a huge amount of data. A service that imitates a person is not then "trained on" that person's text messages, for example. The text messages alone are not enough for a computer to produce passable output.

The text messages are an input to an already trained model, whose original training material is simply not disclosed.
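
To make that concrete, here is a minimal sketch of how such an "imitate a person" service typically works: the person's messages end up as prompt context fed to an already trained foundation model, not as training data. This assumes the OpenAI Python SDK; the model name and the sample texts are hypothetical, and real services may fine-tune instead, but the point stands either way.

```python
# Sketch: the person's texts are *prompt context* for a foundation model,
# not the material the model was trained on. Assumes the OpenAI Python SDK;
# model name and sample messages are hypothetical.
from openai import OpenAI

client = OpenAI()

# The data the service markets as what it was "trained on":
# a handful of the person's texts, pasted into the prompt.
their_texts = [
    "running late, grab me a coffee?",
    "lol no way, send pics",
    "ok ok see you at 7",
]

response = client.chat.completions.create(
    model="gpt-4o",  # the foundation model, trained on undisclosed data
    messages=[
        {
            "role": "system",
            "content": (
                "Imitate the writing style of the person who wrote these texts:\n"
                + "\n".join(their_texts)
            ),
        },
        {"role": "user", "content": "Reply to: 'Miss you, how's your week?'"},
    ],
)
print(response.choices[0].message.content)
```

Everything the model "knows" about how humans write comes from its undisclosed pre-training data; the person's texts only steer it.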

I feel like "trained on" is often dangerously misleading when it refers to a tool built on top of a foundation model. It doesn't really acknowledge that any such service that will "imitate" a person also has thousands if not millions of other human voices as part of the computations it draws from.

And actually "trained on" is misleading in many other ways but I'm out of breath 😅

Anyway, I was listening to @404mediaco's episode on the "AI" avatar of a man who was killed, which "testified" in court. Worth a listen.

https://pca.st/episode/8bc12731-f6db-47a9-a800-14916f1cce3d