Is language inefficient?

NOTE HXA7241 2021-02-14T08:28Z

The limit on GPT-3's understanding is not the efficiency of language in expressing knowledge.

——

Consider this explanation:

“GPT-3 clearly understands some things. But it's pretty shallow, and narrowly limited to knowledge that has been easily expressed in language. The vast majority of human knowledge cannot be efficiently expressed through language.”

‒ https://twitter.com/ylecun/status/1324004518888198145

Is language inefficient at expressing knowledge? This is not so obvious.

An example often advanced is ‘tacit knowledge’, like how to ride a bike. But is that knowledge? In what sense? It is not that one wants to deny it, but bike riding is a vague, complicated set of movements, and is knowledge not something we want to think of as info: discrete symbols for discrete changes?

Also, riding a bike seems difficult in itself: look at the complexity of systems that can do it. You would expect communicating such a skill to take a lot of info, so efficiency is going to be a subtler measure than merely being ‘difficult to say concisely’.

But can we not say it concisely? You could tell someone how to ride a bike very simply: get on, pedal, and keep practicing until you can. That would work! By any measure, that must be a very efficient imparting of knowledge. Yet you could not produce a bike-riding robot from instructions like that.

It looks like we have to retreat to a crude low-level definition of info, and say that it all really depends on what changes are signalled. We would end up with a number like ‘language is 62% efficient’ ‒ which does not really say anything very interesting.

You can discern the issue here by examining ‘expressing’. Expressing must assume a particular material or recipient, so complexity sort-of cancels out. You could send a very simple signal to a complex machine ‒ just a single bit, on/off ‒ and it can do something complex on that simple signal. Knowledge is entirely contextual: a bit-signal's meaning (and its efficiency) depends entirely on the receiver.
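A minimal sketch of that point, in Python purely for concreteness (the two receivers are hypothetical stand-ins, not any real system): the same one-bit signal prompts trivial behaviour in one receiver and elaborate behaviour in another, so how much ‘knowledge’ the bit conveys is a property of the receiver, not of the signal.

   # Hypothetical sketch: the same one-bit signal 'means' different
   # amounts depending on what the receiver already contains.

   def simple_receiver(bit: int) -> str:
       # Almost no built-in context: the bit just selects a label.
       return "on" if bit else "off"

   def complex_receiver(bit: int) -> str:
       # Rich built-in context: the bit triggers an already-embodied
       # procedure (a stand-in for 'go and ride the bike').
       if not bit:
           return "idle"
       steps = ["mount", "push off", "pedal", "balance", "steer", "brake"]
       return " -> ".join(steps)

   # The signal is identical (a single bit); what it expresses is not.
   print(simple_receiver(1))
   print(complex_receiver(1))

By any crude count the second receiver gets far more ‘done’ per bit, yet nothing about the signal itself changed ‒ which is the sense in which efficiency belongs to the context rather than to the language.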

So this seems to be not really about language or its deficiencies. Language is fine. It cannot be a problem with language, because then how would we know that GPT-3 does not know as much as we do? If language were the deficiency, it would cut both ways: not only would GPT-3 be unable to get knowledge from language, we would be unable to judge its knowledge through its communication back to us.

Language, like info, is substantially defined by the material it talks about. So the issue here really devolves to a question not of the efficiency of language, but of the capability/intelligence of the agent handling the language.

——

Suggesting that language cannot efficiently express some knowledge is really just a sliding-scale form of saying that some things are unsayable. But what can language not capture? Are you saying that language cannot quite get at some things? That it cannot quite represent them? But look at language: it is just marks on a page ‒ it is nothing like anything it talks about! On that complaint, language could not represent anything at all. So the vague feeling that some things are unreachable by language (or that language is inefficient) is confused. Language is info: info simply records differences, so anything that makes any difference to us can be captured by language.

You could say that an algorithm (one that might be expressed in the language) does not model some object or phenomenon well. But that is a deficiency of that particular algorithm/model, not of the language medium.

——

Related: