A new approach to AI and generated text.
Dec 12, 2019
The year of publication of Campbell’s The Hero with a Thousand Faces turns out to have been a fortuitous one, as far as my own work and personal retrospective go, anyway. The same post-war academic flurry produced Alan Turing’s landmark paper on the question ‘Can machines think?’, containing the mathematician’s speculations on the capacities of the newly advanced computational engines and, most importantly, their incapacities. Published the following year, in 1950, as ‘Computing Machinery and Intelligence’, Turing’s paper anticipates such breakthroughs as the internet and machine learning, though it mainly concerns itself with defining the terms and provability of its central question. Turing’s writing is simultaneously breathtakingly prescient from our modern perspective and decidedly quaint – the paper devotes a non-trivial amount of time to considering how to shield machine testing from the effects of telepathy, for example – and serves at times as an emblem of how far the field can move in seventy years. I can, however, offer a passage that Turing quotes from Professor Jefferson’s Lister Oration of 1949, which includes the following summation –
‘Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain – that is, not only write it but know it had written it.’
The distinction between a ‘chance fall of symbols’ and a true human composition is as vexed a question for us as Turing’s definition of ‘thought’. But let’s leave Turing for a moment in our quest for machine-composed stories – not to move on, but to stay where we were for a moment.
1949 was the year that kept on giving, for it also saw the publication in book form of Claude Shannon’s The Mathematical Theory of Communication, co-authored with Warren Weaver. What this slim if penetrating study attempts is to appreciate such things as language on a computational level – as a computer might understand language, digitally. For its authors, ‘Language must be designed (or developed) with a view to the totality of things that man may wish to say; but not being able to accomplish everything, it too should do as well as possible as often as possible […] that is to say, it should deal with its task statistically’, and ‘the significant aspect is that the actual message is one selected from a set of possible messages.’ There is a ruthless pragmatism in seeing that, from a computer’s perspective, lexeme choices encode meaning entirely through an intra- and intertextual network of dependencies. Shannon’s work also brings stochastic or ‘Markov’ processes to bear on language – models in which the probability of the next symbol depends on the symbols that came before it – an innovation with huge capacities within machine learning and my own research.
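To make the idea of a Markov process over words concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Shannon’s own procedure: the function names (build_bigram_model, generate) and the toy sample sentence are invented for this example, and the model only looks one word back.

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words observed immediately after it."""
    words = text.lower().split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return dict(followers)

def generate(model, start, length, seed=None):
    """Walk the chain: each next word is drawn in proportion to how
    often it followed the current word in the training text."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        options = model.get(word)
        if not options:  # the word was never followed by anything
            break
        word = rng.choice(options)
        output.append(word)
    return " ".join(output)

# Toy training text (an assumption for illustration only).
sample_text = (
    "the hero enters the cave and the hero escapes the cave "
    "and the hero returns home"
)
model = build_bigram_model(sample_text)
print(generate(model, "the", 8, seed=1))
```

Every adjacent pair of words in the output was seen in the training text, which is all that ‘predicting future variables from previous variables’ amounts to at this scale. Longer contexts and larger corpora give more fluent results, at the cost of more data.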
Subconscious cultural tides sometimes do produce these great academic algal blooms in short spans of time, but it isn’t immediately obvious how these three disparate luminaries fit together – Campbell, Turing, Shannon. The ‘thinking’ of a machine compositor implies some experiential or rule-based learning process that defies mere ‘chance’ – but surely, as we discussed, Campbell’s monomyth is far too vague and inductive a foundation for a set of compositional rules. Even if a machine knew for certain that a ‘Cave Escape’, e.g., was to come next, how could the granular word choices that comprise it be fully ordained? And is composing from rote-learned formulae really ‘thinking’ at all?
Consider if you will one of the tenets of Shannon’s Information Theory – that a signal carries more information the less predictable it is. A flashing light will attract our attention and sustain our interest more than a static light, e.g., and one that flashes irregularly even more so. Completely predictable stories are therefore more or less worthless – as are ones that are completely structurally abstract. Story arcs of all sorts define themselves through change, choice and value exchanges, and the moral payoffs of a story serve and reinforce an in-built sense of reciprocity and justice that maintains the human social fabric and merits the story’s transmission from generation to generation. Likewise, stories need to vary enough from one to another to be new, engaging and applicable to the experience of their audience.
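The flashing-light example can be put in numbers. The sketch below is a rough illustration, assuming that each light is recorded as a string of 1s (on) and 0s (off); the three strings and both function names are made up for the purpose. It measures Shannon entropy, which captures unpredictability rather than variance in the strict statistical sense.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy, in bits per symbol, of the observed frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def conditional_entropy(symbols):
    """Average uncertainty, in bits, about the next symbol once the
    current symbol is known (a first-order estimate)."""
    pairs = Counter(zip(symbols, symbols[1:]))
    firsts = Counter(symbols[:-1])
    total = len(symbols) - 1
    return sum(
        (n / total) * math.log2(firsts[a] / n)
        for (a, _b), n in pairs.items()
    )

# Hypothetical recordings of three lights.
static = "1111111111111111"     # always on
regular = "1010101010101010"    # flashes on a fixed beat
irregular = "1101000110111001"  # flashes without an obvious pattern

for name, signal in [("static", static), ("regular", regular), ("irregular", irregular)]:
    print(name, entropy(signal), conditional_entropy(signal))
```

The static light scores zero on both measures. The regular light varies, so its symbol entropy is a full bit, but once you have seen one state the next is certain, so its conditional entropy is zero. Only the irregular light stays uncertain from one moment to the next, which is the property the argument about stories leans on.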
Enter Vladimir Propp, to considerable narratological fanfare. Pipping this article’s 49-ers by twenty years or so, Propp’s contributions as part of the 20th-century Russian school of structural formalists (a group with a distinct enthusiasm for formulating narrative, natch) are still so widely promulgated as to overshadow the whole field of generated text and cybernarratology. He is their patron saint. Federico Peinado’s automatic tale generator ‘ProtoPropp’ is named for him, as is Matthew Jockers’ sentiment-mapping program ‘Syuzhet-R’ for Propp’s term for narrative discourse. His fundamental methodology, gleaned from the study of a corpus of Russian fairy tales, of summarising any plot as a combination of 31 possible functions was in a way a more nuanced version of Campbell’s inductive comparative mythology. Each function takes a unique form in each story it appears in, but shares a fundamental similarity with all functions of the same type, e.g. weddings, cave escapes, scene-setting. Propp’s statement that ‘New tales will always appear merely as combinations or variations of older ones’ is a significant one, although it is immediately troubled when one considers the indivisible units that must remain constant through this process of appropriation and transformation. Propp’s own categories also display wild variations in generalisability – e.g. the rather specific Function Pr7: ‘attempt to gnaw through a tree’ compared with the fairly broad remit of Function D7: ‘Other requests’. However, Propp’s major contribution is not the specific categories he formed from a narrow corpus, but his methodology of tracing beneath the referential text of narrative discourse a ‘deep structure’ of morphemes whose selection and ordering obey an internal grammar and whose function is universal across contexts.
What Propp failed to investigate during his lifetime, however, is the way functions depend on one another within a single story – a Cave Escape being preceded by a Cave Entrance, for example – and obey expectations on the part of the reader that are part of the suspense of ‘what happens next’. A machine with the capacity to parse a huge corpus of stories and break them down into generalisable functions, as Propp did by hand, would also be able to give precise values for the seriation of each function, that is, its position relative to the others, as well as valuable metadata such as characters and their social relationships, tone and ontological content. These values would give each function a quantitative value within the story (in loose, digital terms) and the capacity to generate text through a Markov process over lexemes (text units) conditioned on preceding content, in the manner of the auto-complete systems of text messaging.
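As a rough sketch of what such ‘values of seriation’ might look like, the Python below takes a handful of stories that have already been reduced to ordered lists of functions and computes two things: where each function tends to fall in a story, and how often one function is immediately followed by another. Everything here is an assumption for illustration. The three toy stories and their labels are invented, not drawn from Propp’s corpus, and the hard part, getting a machine to assign the labels in the first place, is taken as done.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-labelled stories: each is an ordered list of functions.
stories = [
    ["scene_setting", "cave_entrance", "struggle", "cave_escape", "wedding"],
    ["scene_setting", "departure", "cave_entrance", "cave_escape", "return"],
    ["scene_setting", "cave_entrance", "cave_escape", "wedding"],
]

def mean_relative_position(stories):
    """Average position of each function, scaled so that 0.0 is the
    opening of a story and 1.0 is its ending."""
    positions = defaultdict(list)
    for story in stories:
        last = max(len(story) - 1, 1)  # avoid dividing by zero
        for index, function in enumerate(story):
            positions[function].append(index / last)
    return {fn: sum(p) / len(p) for fn, p in positions.items()}

def transition_probabilities(stories):
    """Estimate how likely each function is to come directly after another."""
    counts = defaultdict(Counter)
    for story in stories:
        for current, nxt in zip(story, story[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }

positions = mean_relative_position(stories)
transitions = transition_probabilities(stories)
print(positions["cave_entrance"], positions["cave_escape"])
print(transitions["cave_entrance"])
```

In this toy corpus the Cave Entrance sits earlier on average than the Cave Escape, and the Escape never leads back to the Entrance. Those are exactly the kinds of dependency a reader feels as expectation, and which a generator could sample from in the same way as the word-level auto-complete described above.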
However, even a machine trained to recognise the correct quantitative criteria and make a selection from a near-infinite corpus based on a finite grammar would still consider all stories equally valid. By this model it has no concept of what a good or a boring story is, and we would very much wish to make a machine capable of discriminating as well as composing. The central question to address hereon is therefore whether there are quantifiable variables that make a series of words and/or statements into a satisfying narrative, such that they can be recognised, accurately judged, learned and emulated by an unsupervised artificial intelligence. Our objective will be to establish from a pan-cultural and historical perspective whether this might be so and what these variables might be, and, if artificially generated literature is capable of being produced and consumed on any scale, to help ensure that it is, above all, good. What ‘good’ means, from an objective standpoint, will form part of the discussion.