So never mind "poisoning" - what this study says (and the bulk of the researchers don't work for Anthropic) is that the data integrity of a training corpus of roughly 260 billion tokens (420,000 / 0.0000016), on the order of 210 billion words - several dozen English Wikipedias' worth of text - can be compromised by about 336,000 words (420,000 tokens × ~4/5): any one of the extant Game of Thrones novels, or any two Harry Potter books. How much of the available training data out there treats fan death credibly, for example?

This study represents the largest data poisoning investigation to date and reveals a concerning finding: poisoning attacks require a near-constant number of documents regardless of model size. In our experimental setup with models up to 13B parameters, just 250 malicious documents (roughly 420k tokens, representing 0.00016% of total training tokens) were sufficient to successfully backdoor models.
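
For anyone who wants to redo the back-of-envelope arithmetic, here is a minimal sketch. Only the poison size (250 documents, ~420k tokens, 0.00016% of training tokens) comes from the study; the words-per-token ratio and the reference word counts for Wikipedia and the novels are rough assumptions:

```python
# Back-of-envelope check of the scale comparison above.
poison_tokens = 420_000            # ~250 malicious documents (from the study)
poison_fraction = 0.00016 / 100    # 0.00016% of total training tokens (from the study)
words_per_token = 4 / 5            # rough English words-per-token ratio (assumption)

total_tokens = poison_tokens / poison_fraction   # ~2.6e11 tokens
total_words = total_tokens * words_per_token     # ~2.1e11 words
poison_words = poison_tokens * words_per_token   # ~336,000 words

# Rough reference sizes (assumptions, order-of-magnitude only).
english_wikipedia_words = 4.5e9    # ~4-5 billion words of article text
got_novel_words = 350_000          # a typical Game of Thrones novel
harry_potter_words = 170_000       # an average Harry Potter book

print(f"Training corpus: ~{total_tokens:.3g} tokens (~{total_words:.3g} words)")
print(f"  ~= {total_words / english_wikipedia_words:.0f} English Wikipedias")
print(f"Poisoned material: ~{poison_words:,.0f} words")
print(f"  ~= {poison_words / got_novel_words:.1f} Game of Thrones novels")
print(f"  ~= {poison_words / harry_potter_words:.1f} Harry Potter books")
```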