This is what happens when machine learning needs vast amounts of data to build statistical models for its responses: historical, debunked data makes it into the models and ends up favoured in their output. There is far more outdated, harmful information in circulation than there is updated, correct information, which makes the outdated view the statistically dominant one.
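
To make that point concrete, here is a deliberately oversimplified toy sketch (not how any real language model is built) of why a purely statistical learner echoes whatever claim dominates its training data, regardless of which claim is actually correct. The corpus below is invented purely for illustration.

```python
# Toy illustration: a frequency-based "model" simply repeats the majority
# claim in its training data, so debunked statements that are overrepresented
# in the corpus win out over correct but rarer ones.
from collections import Counter

# Hypothetical corpus: the outdated, debunked claim appears more often
# than the corrected one (as is often true of older published material).
training_corpus = [
    "debunked claim",
    "debunked claim",
    "debunked claim",
    "corrected claim",
]

counts = Counter(training_corpus)
most_probable = counts.most_common(1)[0][0]
print(most_probable)  # prints "debunked claim" - frequency wins, not accuracy
```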

"In some cases, they appeared to reinforce long-held false beliefs about biological differences between Black and white people that experts have spent years trying to eradicate from medical institutions."

In this regard, the tools don't take us to the future, but to the past.

No, you should never use language models for health advice. But many people are arguing for exactly that to happen. I also believe these kinds of harmful biases make it into far more machine learning applications than just language models.

In libraries across the world that use the Dewey Decimal System (138 countries), LGBTI (lesbian, gay, bisexual, transgender and intersex) topics were, throughout the 20th century, variously filed under categories such as Abnormal Psychology, Perversion and Derangement, treated as a Social Problem, and even classified as Medical Disorders.

Of course, many of these historical biases are part of the source material used to build today's "intelligent" machines, bringing with them the risk of undoing decades of progress.

It's important to understand how large language models work if you are going to use them. The way they have been released into the world means there are many people (including powerful decision-makers) with faulty expectations and a poor understanding of what they are using.

https://www.nature.com/articles/s41746-023-00939-z

#DigitalEthics #AIEthics