CommunityNews

Playing games with AIs: The limits of GPT-3 and similar large language models

Playing Games with AIs: The Limits of GPT-3 and Similar Large Language Models - Minds and Machines.
This article contributes to the debate around the abilities of large language models such as GPT-3, dealing with three things: firstly, how well GPT does in the Turing Test; secondly, the limits of such models, especially their tendency to generate falsehoods; and thirdly, the social consequences of the problems these models have with truth-telling. We start by formalising the recently proposed notion of reversible questions, which Floridi & Chiriatti (2020) propose allow one to ‘identify the nature of the source of their answers’, as a probabilistic measure based on Item Response Theory from psychometrics. Following a critical assessment of the methodology which led previous scholars to dismiss GPT’s abilities, we argue against claims that GPT-3 completely lacks semantic ability. Using ideas of compression, priming, distributional semantics and semantic webs, we offer our own theory of the limits of large language models like GPT-3, and argue that GPT can competently engage in various semantic tasks. The real reason GPT’s answers seem senseless is that truth-telling is not amongst those tasks. We claim that these kinds of models cannot be forced into producing only true continuations; rather, to maximise their objective function, they strategize to be plausible instead of truthful. This, we moreover claim, can hijack our intuitive capacity to evaluate the accuracy of their outputs. Finally, we show how this analysis predicts that a widespread adoption of language generators as tools for writing could result in permanent pollution of our informational ecosystem with massive amounts of very plausible but often untrue texts.
