Posts from a Drupal site, approx. 2008-2021

I was trying to draw parallels between what I was finding for language and the impact of complexity in a broader range of domains.

Original Drupal Site Design

The original site emphasized tag-based navigation and connections between ideas:

[Image: Original Drupal Site Layout]

The tag system was central to the site's organization, reflecting the theoretical premise of multiple ways to organize complex information:

[Image: Drupal Tag System]

Theoretical Outline

Linguistics attempts to find a system for language.

In common with broader science, this search has moved in the direction of seeking general laws, or rules: broadly speaking, grammar.

The conjecture of this website is that such rules cannot be abstracted for natural language.

Looked at more broadly, this seems to be part of a general revision of the place of theory in science. Gödel’s mathematical incompleteness appears to be a version of it. Stephen Wolfram’s “computational irreducibility” appears to be another.

At root, though, the idea does not require sophisticated mathematics or physics to grasp. It can be as simple as the intuition that there might be more ways of arranging a set of objects meaningfully than there are objects themselves.
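
A minimal sketch of the counting behind that intuition, using nothing beyond the standard library: even restricted to subsets (ways of grouping) and orderings, the arrangements of n objects quickly dwarf n.

    import math

    # Ways of grouping (subsets) and arranging (orderings) n objects,
    # set against n itself.
    for n in range(1, 11):
        print(f"n={n:2d}  subsets={2**n:5d}  orderings={math.factorial(n):8d}")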

The claim sounds innocuous enough, and unlikely to hold much interest outside of mathematics. But it has significant practical consequences.

In particular, from the point of view of natural language, it implies that there may be more regularities, and thus apparent rules, among the sentences of a natural language than there are sentences in the language. Hence natural language may be, in a very precise way, “irregular”. More exactly, it will need to be modeled as an automaton and not as a set of rules.

An exploration from the point of view of language has revealed similar ideas in a diverse range of disciplines, including mathematics and cognition.

The unifying thread might be summarized as “complex” or chaotic behaviour. It is at least conjectured that these ideas are related to the very young study of “complex” or chaotic systems.

We are only beginning to understand the implications of such systems. In fact they seem to go very deep and redefine even our usual conception of science itself as the regular expression of principles.

The aim of this site is to collect a range of such ideas from a number of fields, in the hope that researchers might be encouraged to compare notes, and that linguists in particular might be led to rethink some traditional problems which have fractured the subject into distinct “schools” (either rejecting structure or rejecting distributional analysis) and broadly prevented progress.

The intention is that users should access the site by clicking on “tags”, listed left, which reflect their interest, and then follow tags within the linked articles, revealing hidden connections. It is the tags which pull the site together, as befits a site whose basic premise is that there is fundamentally more than one way of organizing any (sufficiently complex) set of information.

theory linguistics complexity chaos grammar incompleteness computational-irreducibility

Old Introduction

The central idea of this website can be reduced to the assertion that you can find more regularities among a set of objects (if it is sufficiently complex/irregular) than there are objects in the set.

theory linguistics complexity chaos regularities automaton

Wittgenstein's games

“Consider for example the proceedings that we call ‘games’. I mean board games, card games, ball games, Olympic games, and so on. What is common to them all? Don’t say, “There must be something common, or they would not be called ‘games’” - but look and see whether there is anything common to all. For if you look at them you will not see something common to all, but similarities, relationships, and a whole series of them at that. To repeat: don’t think, but look! Look for example at board games, with their multifarious relationships. Now pass to card games; here you find many correspondences with the first group, but many common features drop out, and others appear. When we pass next to ball games, much that is common is retained, but much is lost. Are they all ‘amusing’? Compare chess with noughts and crosses. Or is there always winning and losing, or competition between players? Think of patience. In ball games there is winning and losing; but when a child throws his ball at the wall and catches it again, this feature has disappeared. Look at the parts played by skill and luck; and at the difference between skill in chess and skill in tennis. Think now of games like ring-a-ring-a-roses; here is the element of amusement, but how many other characteristic features have disappeared! And we can go through the many, many other groups of games in the same way; can see how similarities crop up and disappear. And the result of this examination is: we see a complicated network of similarities overlapping and criss-crossing: sometimes overall similarities, sometimes similarities of detail.”

http://en.wikipedia.org/wiki/Prototype_Theory

philosophy linguistics wittgenstein

Gosper's glider gun

“Conway conjectured on the existence of infinitely growing patterns, and offered a reward for an example. Gosper was the first to find such a pattern (specifically, the Glider gun), and won the prize.”

http://en.wikipedia.org/wiki/Bill_Gosper
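
A minimal sketch of the Life rule itself. The pattern below is the standard five-cell glider, which Gosper’s gun emits, not the gun; the point is that a trivial deterministic rule supports patterns with open-ended behaviour.

    from collections import Counter

    def step(live):
        """One generation of Conway's Life on a sparse set of live cells."""
        counts = Counter((x + dx, y + dy)
                         for (x, y) in live
                         for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                         if (dx, dy) != (0, 0))
        return {cell for cell, n in counts.items()
                if n == 3 or (n == 2 and cell in live)}

    glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
    cells = glider
    for _ in range(4):
        cells = step(cells)

    # After four generations the glider reappears one cell down and right:
    # a pattern that travels forever across an empty grid.
    assert cells == {(x + 1, y + 1) for (x, y) in glider}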

cellular-automata complexity mathematics

Thomas Kuhn

Thomas Kuhn, The Structure of Scientific Revolutions

p. 192 (Postscript): “When I speak of knowledge embedded in shared exemplars, I am not referring to a mode of knowing that is less systematic or less analyzable than knowledge embedded in rules, laws, or criteria of identification. Instead I have in mind a manner of knowing which is misconstrued if reconstructed in terms of rules that are first abstracted from exemplars and thereafter function in their stead.”

philosophy kuhn

Lamb's 'non-linearity' vs. Chomsky's 'loss of generality'

Response by Sydney Lamb, on the Funknet discussion list, to a post by me asking about Chomsky’s objections to the abstraction of phonemes from language data. My post first:

The particular analysis which interests me is one I found in a historical retrospective by Fritz Newmeyer and others “Chomsky’s 1962 programme for linguistics” (in Newmeyer’s “Generative Linguistics – A Historical Perspective”, Routledge, 1996, and apparently also published in “Proc. of the XVth International Congress of Linguists”.)

Newmeyer is talking mostly about Chomsky’s “Logical basis of linguistic theory” paper (presented at the Ninth Int. Congress of Linguists?). Chomsky’s argument as he presents it focused largely on phonology, and was controversial because it attacked what was at the time “considered a fundamental scientific insight: the centrality of the contrastive function of linguistic elements.” …According to Newmeyer, “part of the discussion of phonology in ‘LBLT’ is directed towards showing that the conditions that were supposed to define a phonemic representation (including complementary distribution, locally determined biuniqueness, linearity, etc.) were inconsistent or incoherent in some cases and led to (or at least allowed) absurd analyses in others.” Most importantly, the interposition of such a “phonemic level … led to a loss of generality in the formulation of the rule-governed regularities of the language.” …

Lamb’s reply:

Chomsky was correct in pointing out that some of the criteria in use at that time for defining phonemic representations were less than airtight, but his alternative phonological proposals were even more faulty. I analyzed every one of his arguments against the “classical phonemic level” (e.g. the Russian obstruents, the English vowel length difference before voiced vs. voiceless syllable-final consonants) and found flaws in every one, some of them rather egregious. Conclusion: his arguments about “loss of generality” are wrong – every one of them.

For example, perhaps his most celebrated argument concerns the Russian obstruents. He correctly pointed out that the usual solution incorporates a loss of generality, but he misdiagnosed the problem. The problem was the criterion of linearity. He stubbornly holds on to this criterion, although it really is faulty, and comes up with a solution for the Russian obstruents that obscures the phonological structure. I showed (in accounts cited below) that by relaxing the linearity requirement we get an elegant solution while preserving “centrality of contrastive function of linguistic elements”.

The errors in Chomsky’s arguments (together with a defense of the “centrality of contrastive function of linguistic elements”) have been pointed out in a number of publications, including:

Lamb, review of Chomsky …. American Anthropologist 69.411-415 (1967).

Lamb, Prolegomena to a theory of phonology. Language 42.536-573 (1966) (includes analysis of the Russian obstruents question, as well as a more reasonable critique of the criteria of classical phonemics).

Lamb and Vanderslice, On thrashing classical phonemics. LACUS Forum 2.154-163 (1976).

See also the discussion in Lamb, Linguistics to the beat of a different drummer. First Person Singular III. Benjamins, 1998 (reprinted in Language and Reality, Continuum, 2004).

linguistics phonology chomsky lamb

Nativelike selection

The seminal work of Pawley and Syder demonstrates that natural language is far from random, but equally far from regular:

p. 2: “The problem we are addressing is that native speakers do not exercise the creative potential of syntactic rules to anything like their full extent, and that, indeed, if they did do so they would not be accepted as exhibiting nativelike control of the language. The fact is that only a small proportion of the total set of grammatical sentences are nativelike in form - in the sense of being readily acceptable to native informants as ordinary, natural forms of expression, in contrast to expressions that are grammatical but are judged to be ‘unidiomatic’, ‘odd’ or ‘foreignisms’.”

Pawley, A. and H. Syder. 1983. Two puzzles for linguistic theory: nativelike selection and nativelike fluency. In Language and communication, eds. J. Richards and R. Schmidt, London: Longman, pp. 191–226.

http://www.linguistics.uwa.edu.au/__data/page/74645/PawleySyder.pdf

linguistics

Hutter Prize - Incompressibility of text

The Hutter Prize reflects the fact that we cannot compress natural language as much as we would expect:

“…in 1950, Claude Shannon estimated the entropy (compression limit) of written English to be about 1 bit per character [3]. To date, no compression program has achieved this level.” (http://cs.fit.edu/~mmahoney/compression/rationale.html)

It was inspired by the work of Marcus Hutter, which aims to show that compression can be used as a functional definition of intelligence.

The idea behind the prize is the old one that we cannot predict, and thus compress, text as much as we expect because we need human knowledge to “disambiguate” natural language. I believe almost the opposite. But the prize, and the work of Marcus Hutter which motivated it, is interesting for what it says about the predictability of natural language, and in particular the “randomness” of meaning. By the “randomness of meaning” I mean that Hutter’s work (like Schmidhuber’s “New AI”) assumes it is necessary to use a probabilistic model of intelligence.
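
A minimal sketch of where the gap lies (the sample text is only a stand-in for real English): a model that ignores context altogether already comes out above 4 bits per character, and all the distance down to Shannon’s ~1 bit is structure a compressor must capture.

    import math
    from collections import Counter

    text = "the quick brown fox jumps over the lazy dog " * 20

    # Order-0 entropy: the compression limit for a model that treats
    # characters as independent. Repetition does not lower it; only
    # exploiting context (order-1, order-2, ... models) can.
    counts = Counter(text)
    total = len(text)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    print(f"order-0 entropy: {h:.2f} bits/char")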

Note that it is also a definition of intelligence dependent on goals (cf. W. J. Freeman). Hutter: “No Intelligence without Goals.”

Hutter has written a book on this:

Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability

http://www.hutter1.net/ai/uaibook.htm

There is a 50,000 euro prize for the best effort. (http://prize.hutter1.net/)

compression intelligence linguistics

Walter J. Freeman - Observable chaos in EEG studies of the brain

"Our studies have led us as well to the discovery in the brain of chaos- complex behavior that seems random but actually has some hidden order. The chaos is evident in the tendency of vast collections of neurons to shift abruptly and simultaneously from one complex activity pattern to another in response to the smallest of inputs.

This changeability is a prime characteristic of many chaotic systems. It is not harmful in the brain. In fact, we propose it is the very property that makes perception possible. We also speculate that chaos underlies the ability of the brain to respond flexibly to the outside world and to generate novel activity patterns, including those that are experienced as fresh ideas."

(http://sulcus.berkeley.edu/FLM/MS/Physio.Percept.html)

From a discussion on the Ontolog forum (http://ontolog.cim3.net/forum/ontolog-forum/2008-02/msg00053.html):

“What I find most exciting is the way chaos acts as an enormous well of new structure. Lack of order is not necessarily a bad thing. Rather it can be thought of as freeing you to create new structure (if something is ordered its structure is bounded).

For instance, W. J. Freeman’s work makes much of the way an organism seems to continuously fold experience into new configurations which represent new perspectives on a stimulus. He makes a big point of there being no ‘meaning’ for an organism independent of intent.

E.g. ‘Perceived time differs from world time in ways that are determined by the neural mechanisms of intentionality.’

(Perception of Time and Causation Through the Kinesthesia of Intentional Action, 2008. http://www.springerlink.com/content/u25052w300187661/)

So the great thing is you get all this new structure (liberated by disorder). From a meaning representation point of view this new structure could be thought of as constantly letting you find new relationships among data. From a linguistic point of view the new structure might be thought of as constraining what can be said (to explain the detail of collocation, etc.)”

chaos neuroscience intelligence complexity

Kuhn on Wittgenstein

“What need we know, Wittgenstein asked, in order that we apply terms like ‘chair’, or ‘leaf’, or ‘game’ unequivocally and without provoking argument?

That question is very old and has generally been answered by saying that we must know, consciously or intuitively, what a chair, or a leaf, or game is. We must, that is, grasp some set of attributes that all games and only games have in common. Wittgenstein, however, concluded that, given the way we use language and the sort of world to which we apply it, there need be no such set of characteristics. Though a discussion of some of the attributes shared by a number of games or chairs or leaves often helps us learn how to employ the corresponding term, there is no set of characteristics that is simultaneously applicable to all members of the class and to them alone. Instead, confronted with a previously unobserved activity, we apply the term ‘game’ because what we are seeing bears a close “family resemblance” to a number of the activities that we have previously learned to call by that name. For Wittgenstein, in short, games, and chairs, and leaves are natural families, each constituted by a network of overlapping and crisscross resemblances. The existence of such a network sufficiently accounts for our success in identifying the corresponding object or activity.”

Notice how Wittgenstein’s definition of meaning is fundamentally a set.

Note also that by his definition there is no single sufficient set, only many, mutually contradictory sets.
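
The point can be sketched set-theoretically (the feature lists below are invented for illustration): no feature is common to all the games, yet every game overlaps with some other.

    # Invented feature sets for Wittgenstein's examples.
    games = {
        "chess":               {"board", "competition", "skill", "winning"},
        "patience":            {"cards", "skill", "winning"},   # no competition
        "tennis":              {"ball", "competition", "skill", "winning"},
        "wall-catch":          {"ball", "skill", "amusement"},  # no winning
        "ring-a-ring-a-roses": {"amusement", "singing"},
    }

    # No set of features picks out all and only the games...
    common = set.intersection(*games.values())
    print("common to all:", common or "none")

    # ...but each game shares features with at least one other.
    for name, feats in games.items():
        kin = [other for other, f in games.items() if other != name and feats & f]
        print(f"{name}: overlaps {', '.join(kin)}")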

If you’re interested in exploring such “set theoretic” ideas about meaning, I strongly recommend the whole of Kuhn’s book. Of course I knew Kuhn was famous for proposing that scientific progress/knowledge was discontinuous (and partially subjective), but I never realized how much he had to say about the nature of knowledge itself. In fact he defines knowledge fundamentally as sets of examples. This equivalence between sets of examples, the original sense of “paradigm”, and knowledge is where his famous use of the word “paradigm” in the sense of “world view” or “scientific theory” comes from.

Note also that (I would argue) attempts to base mathematics in set theory 100 or so years ago were not unrelated to an interpretation of “meaning” in terms of sets.

From a discussion entitled “About enwik and AI” on comp.compression, Jan. 2007 (http://newsgroups.derkeiler.com/Archive/Comp/comp.compression/2007-01/msg00018.html.)

philosophy linguistics kuhn wittgenstein

"Games", Conway, names, and meaning

“One of the most brilliant mathematicians of the last and current century is John Horton Conway. Near the middle of the last century he formalized a notion of game in terms of a certain recursive data structure. He went on to show that every notion of number that has made it into the canon of numerical notions could be given representations in terms of this data structure. These ideas are documented in his delightful On Numbers and Games. Knuth popularized some of these ideas in his writings on surreal numbers.”

http://biosimilarity.blogspot.com/2008/03/naming-as-dialectic.html

mathematics conway games meaning naming

Grammar: formally incomplete or just random?

Natural language appears to be random (cf. the Hutter Prize page):

“…in 1950, Claude Shannon estimated the entropy (compression limit) of written English to be about 1 bit per character [3]. To date, no compression program has achieved this level.” (http://cs.fit.edu/~mmahoney/compression/rationale.html)

The most successful contemporary natural language technologies are probabilistic.

The usual explanation is that something external selects between alternatives which are equally probable on linguistic grounds. Commonly this external factor is assumed to be “meaning”.

Noam Chomsky in the 1950s convinced a generation that this external factor was an innate “language organ”.

We now know that incomplete formal systems show properties of randomness.

linguistics randomness compression

Randomness and cellular automata

Rudy Rucker on cellular automata: “I was first hooked on modern cellular automata by [Wolfram84]. In this article, Wolfram suggested that many physical processes that seem random are in fact the deterministic outcome of computations that are simply so convoluted that they cannot be compressed into shorter form and predicted in advance. He spoke of these computations as “incompressible,” and cited cellular automata as good examples.”

http://www.fourmilab.ch/cellab/manual/chap5.html
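
A minimal sketch of the kind of computation Wolfram means: elementary Rule 30 is completely deterministic, yet its centre column looks random and resists compression.

    RULE = 30  # the update table, packed into the bits of an integer

    def center_column(steps):
        """Run Rule 30 from a single live cell; return the centre column."""
        width = 2 * steps + 3
        row = [0] * width
        row[width // 2] = 1
        bits = []
        for _ in range(steps):
            bits.append(row[width // 2])
            row = [(RULE >> (4 * row[i - 1] + 2 * row[i] + row[(i + 1) % width])) & 1
                   for i in range(width)]
        return bits

    print("".join(map(str, center_column(64))))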

cellular-automata randomness compression

Relationship of Gödel's incompleteness theorem to uncertainty principles and randomness

From Heisenberg to Gödel via Chaitin

C. S. Calude, M. A. Stay (submitted 26 Feb 2004 (v1), last revised 11 Jul 2006 (this version, v6))

Abstract: In 1927 Heisenberg discovered that the “more precisely the position is determined, the less precisely the momentum is known in this instant, and vice versa”. Four years later Gödel showed that a finitely specified, consistent formal system which is large enough to include arithmetic is incomplete. As both results express some kind of impossibility it is natural to ask whether there is any relation between them, and, indeed, this question has been repeatedly asked for a long time. The main interest seems to have been in possible implications of incompleteness to physics. In this note we will take interest in the converse implication and will offer a positive answer to the question: Does uncertainty imply incompleteness? We will show that algorithmic randomness is equivalent to a “formal uncertainty principle” which implies Chaitin’s information-theoretic incompleteness. We also show that the derived uncertainty relation, for many computers, is physical. In fact, the formal uncertainty principle applies to all systems governed by the wave equation, not just quantum waves. This fact supports the conjecture that uncertainty implies randomness not only in mathematics, but also in physics.

http://arxiv.org/abs/quant-ph/0402197

mathematics incompleteness randomness physics

Wolfgang Wildgen's "Catastrophe theoretical models in semantics"

John Sowa writing on the Ontolog discussion list 2007:

“René Thom, who founded catastrophe theory and received the Fields Medal for his efforts, was firmly convinced that all areas of human perception and cognition – language, in particular – depend on features that are closely related to catastrophe theory.

Wolfgang Wildgen is a linguist who developed Thom’s ideas on catastrophe theory applied to semantics. For various papers and PowerPoint slides (in English, German, and French) see the list on his website:

http://www.fb10.uni-bremen.de/lehrpersonal/wildgen.aspx

Wildgen also published several books in English that present Thom’s ideas and Wildgen’s extensions in some detail.”

catastrophe-theory semantics linguistics wildgen thom

The site is expanding

The format is changing from a narrow focus on parsing to a broader consideration of language and meaning, and the relationship of both to complexity theory.

I hope the site retains the terse, aphoristic style. And I hope the content remains useful to the reader.

meta complexity language

Determinate system with random statistics

From a thread on the comp.compression discussion list, 2007:

I think the insight we have been looking for to model AI/NLP is that the information needed to code the different ways of ordering a system (knowledge) is always greater than the information needed to code the system itself (for a random system).

In the context of AI/NLP it is important to note that random need not mean indeterminate. I hope I demonstrated this in our earlier thread on comp.compression. In the case of language, relational distributions can precisely determine a syntax which nevertheless has random statistics. To be clear, I am talking once again about our toy example: “AX…DX…DB…AZ…YZ…YC…A_”. Here different relational distributions (A,D), (A,Y) (corresponding to “knowledge”: in some ways “A is close to D but not Y”, but in other ways “A is close to Y but not D”?) deterministically specify syntax, but still produce a random language model A_ -> AB/AC (probably because the distributions are contradictory).

These distributions specify the system, but it is more expensive to enumerate all the different possible relational distributions than it is to enumerate the system itself. (So they can’t be used to compress it.)
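
The reasoning can be sketched in code (a reconstruction, not code from the thread): the relational distributions are fully determinate, yet they license both AB and AC for the gap “A_”.

    # Observed two-token strings from the toy language.
    corpus = ["AX", "DX", "DB", "AZ", "YZ", "YC"]

    # Collect the right-hand contexts of each initial token.
    succ = {}
    for first, second in corpus:
        succ.setdefault(first, set()).add(second)
    # succ == {"A": {"X", "Z"}, "D": {"X", "B"}, "Y": {"Z", "C"}}

    # Complete "A_" by analogy: any token sharing a context with A
    # lends A its remaining contexts.
    predictions = set()
    for other, contexts in succ.items():
        if other != "A" and contexts & succ["A"]:
            predictions |= contexts - succ["A"]

    # A resembles D (shared X) and Y (shared Z). The two analogies
    # contradict each other, so the determinate distributions still
    # leave a "random" choice of continuation:
    print(predictions)  # {"B", "C"}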

AI NLP randomness complexity compression

Number of predictive classes you can find in text

From a discussion on the “Hutter-Prize” Google group:

…If A_1, A_2,… A_n are the contexts of A in some text, and X_1, X_2,…X_n are contexts of other tokens, then the number of ways A can have common contexts with other tokens in the text, and thus uniquely specify some new paradigmatic class, are just Matt’s “(n choose k) = n!/(k!(n-k)!) possible sets”, where k is the number of common contexts between A and some other token.

The syntagmatic distribution of sequences AX_? specified by these classes in the text can be random, because many different paradigmatic distributions (A_i,…A_i+k) can be equally likely (and must be, because many of the “n choose k” possible classes will overlap, and thus form complementary distributions with each other??) But the relationship between any given syntax and its corresponding paradigmatic distribution is not random. And the different paradigmatic distributions (knowledge?) governing that random syntax are not random either, just very numerous. Much more numerous than the sequence of 2n or so tokens needed to specify them.
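
The counting can be checked back-of-envelope (n here is arbitrary): summing “n choose k” over k gives on the order of 2^n candidate classes, against only ~2n tokens that exhibit them.

    from math import comb

    n = 20  # contexts observed for some token A
    classes = sum(comb(n, k) for k in range(1, n + 1))  # == 2**n - 1
    print(f"tokens in the text: ~{2 * n}")               # ~40
    print(f"candidate paradigmatic classes: {classes}")  # 1048575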

compression linguistics paradigmatic-classes hutter-prize