cross-posted from: https://sh.itjust.works/post/49077840
Artificial intelligence (AI) chatbots are worse at retrieving accurate information and reasoning when trained on large amounts of low-quality content, particularly if the content is popular on social media1, finds a preprint posted on arXiv on 15 October.
In data science, good-quality data need to meet certain criteria, such as being grammatically correct and understandable, says co-author Zhangyang Wang, who studies generative AI at the University of Texas at Austin. But these criteria fail to capture differences in content quality, he says.



…not just llm’s brain…