Tuesday, October 31, 2017

Do you need to visit Italy to write "The Merchant of Venice"?

One of the more famous twentieth-century theories of non-Stratfordian authorship of the works of Shakespeare holds that Edward de Vere, the 17th Earl of Oxford and one-time protege of Queen Elizabeth the First, wrote the works instead.

One part of that controversy that may be well suited for the discussion of contradictions is the claim that the frequent setting of Shakespeare's plays in Italy indicates a well-travelled author who had visited that region. The founder of the Oxfordian theory, J. Thomas Looney, wrote:
[The author of the Merchant of Venice, RCK] knew Italy first hand and was touched with the life and the spirit of the country. 
(Wikipedia notes that this claim had also been used to support the authorship candidacy of the Earl of Rutland and the Earl of Derby, who had also travelled the European continent.)
Oxfordian William Farina refers to Shakespeare's apparent knowledge of the Jewish ghetto, Venetian architecture and laws in The Merchant of Venice, especially the city's 'notorious Alien Statute'. 
There is evidence that De Vere lived and traveled in Italy for over a year. He was there when writing to Lord Burghley on September 24, 1575, though in that letter he disparages Italy and says that he would "care not ever to see it anymore". De Vere departed Venice in March of 1576. The Venetian Inquisition received testimony of De Vere's fluency in Italian. Oxfordian Anderson argues that Oxford
... visited Venice, Padua, Milan, Genoa, Palermo, Florence, Siena and Naples, and probably passed through Messina, Mantua and Verona, all cities used as settings in Shakespeare.
In contra-indication, Shakespearian scholars have pointed out:

  • As far as The Merchant of Venice is concerned, "the play itself knows nothing about the Venetian ghetto; we get no sense of a legally separate region of Venice where Shylock must dwell" (Kenneth Gross)
  • Similarly, the setting is described as "a nonrealistic Venice" and the laws invoked by Portia as part of the "imaginary world of the play" inconsistent with then-existing legal practice (Scott McCrea)
  • The Alien Statute bears little resemblance to any Italian Law (Charles Ross). 
  • Lewes Lewknor's 1599 English translation of Gasparo Contarini's The Commonwealth and Government of Venice provides details on Venice's laws and customs that Shakespeare could have used in Othello, for example.
  • The Italian scholar John Florio, who lived in England and was consulted by Ben Jonson for Italian details for Volpone, published two books, First Fruits (1578) and Second Fruits (1591), the latter a bilingual introduction to Italian language and culture, which have been suggested as the origin of Italian idioms and dialogue (e.g. in The Taming of the Shrew) by Keir Elam and Jason Lawrence. 

Friday, September 29, 2017

An Early Akkadian Historical Argument

I have often felt that the initiation rite that Gilgamesh does not pass in Tablet XI of the standard Akkadian version (right after the flood narrative of Utanapishtim) of the Epic is one of the earliest historical arguments that we have (ca 1200BC, from the Library of Ashurbanipal in Nineveh). It is unclear to me whether the older Babylonian versions contained this section.

The context of the story is the challenge that the "Akkadian Noah" Utanapishtim gives to Gilgamesh to show that he is worthy of eternal life; he must not fall asleep for a week. Utanapishtim and his unnamed wife had earlier received, as an exception, immortality from Enlil, because they had overheard the secrets of the Gods; cf. L190ff.
(L198) [Utanapishtim said:] "Now then, who will convene the gods on your behalf,
that you may find the life that you are seeking!
Wait! You must not lie down for six days and seven nights."
soon as he [Gilgamesh] sat down (with his head) between his legs
sleep, like a fog, blew upon him. 
Gilgamesh falls asleep immediately, and Utanapishtim complains to his wife that Gilgamesh failed, but she asks for mercy, that he may wake up and return home. 
(L202) Utanapishtim said to his wife:
"Look there! The man, the youth who wanted (eternal) life!
Sleep, like a fog, blew over him."
His wife said to Utanapishtim the Faraway:
"Touch him, let the man awaken.
Let him return safely by the way he came.
Let him return to his land by the gate through which he left." 
Utanapishtim is concerned that Gilgamesh will deny falling asleep altogether, so he sets up a temporal trap, asking his wife to mark off the days on the wall and place fresh loaves of bread next to him.
(L208) Utanapishtim said to his wife:
"Mankind is deceptive, and will deceive you.
Come, bake loaves for him and keep setting them by his head
and draw on the wall each day that he lay down."
She baked his loaves and placed them by his head
and marked on the wall the day that he lay down.
The first loaf was dessicated,
the second stale, the third moist(?), the fourth turned white,
its ..., the fifth sprouted gray (mold), the sixth is still fresh. [...]"
One suspects that the earlier version had just the marks on the wall, and that the later version added the loaves to fix the fact that the marks show no temporal progression per se, since they could all have been made at once. This is independently interesting, if my suspicions are correct.
(L218) The seventh--suddenly he touched him and the man awoke.
Gilgamesh said to Utanapishtim:
"The very moment sleep was pouring over me
you touched me and alerted me!"
As Utanapishtim predicted, Gilgamesh tries to deny it by claiming that he had been sleeping for just a moment (i.e. was not really asleep yet). But the evidence makes that story unsupportable.
(L222) Utanapishtim spoke to Gilgamesh, saying:
"Look over here, Gilgamesh, count your loaves!
You should be aware of what is marked on the wall!
Your first loaf is dessicated,
the second stale, the third moist, your fourth turned white,
its ...
the fifth sprouted gray (mold), the sixth is still fresh.
The seventh--at that instant you awoke!"
Gilgamesh said to Utanapishtim the Faraway:
"O woe! What shall I do, Utanapishtim, where shall I go!
...."
As far as the translation is concerned, the problem is the rotting away of the bread. E.A. Speiser had "soggy" instead of moist.


Colophon: Tablet XI translated by Maureen Gallery Kovacs, electronic version by Wolf Carnahan, line numbers according to ANET, 3rd edition with Supplement (pp.95-96).

Disagreements in Wikipedia

I have been tracking disagreements between articles in Wikipedia for a while now, and just found another nice one (though admittedly one of the articles is a stub). I found this while researching this post. The articles are about the Egyptian Pharaoh Djedkare Isesi, specifically his daughters, and the man married to one of them, Senedjemib Mehi, who had finished the tomb that the previous post talked about.

While the stub-article on Senedjemib Mehi identifies his wife Khentkaus as a daughter of either Pharaoh Djedkare Isesi or Unas, the successor of Djedkare Isesi, the article on Djedkare Isesi writes:
Less certain is the filiation of Kentkhaus III, wife of vizier Senedjemib Mehi, who bore the title of "king's daughter of his body".[75][76] It is debated whether this title indicates a true filiation or if it is only honorary.[76][77]
The fascinating part is that the quote draws upon Brovarski's 2001 work, The Senedjemib Complex, p.30 (= Fn 75), but so does the stub-article on Mehi!

An Old Egyptian historical Argument

In the autobiographical part of his tomb inscription, Vizier Senedjemib Inti of the Fifth Dynasty of Egypt, who worked under Pharaoh Djedkare Isesi, writes:
When it came to pass [Register 4] his majesty caused that I be anointed with fat [by the side of his majesty] [Register 5] [Neve]r [was done] the like by the side of the king for anyone. [Register 6] -- James H. Breasted, Ancient Records of Egypt, Vol 1, #270, p.122.
Given the dating of the Fifth Dynasty, that would put the record between the late 25th and the mid 24th century BC; see here for some of the proposed dates of Djedkare Isesi's reign.

Monday, August 14, 2017

Reactions to Schank

In my previous post, I reviewed Roger Schank's Tell me a Story, New York (Scribner) 1990 (= NWU Press, 1995) as to the problem of how to construct narratives, what their constituent pieces are, and especially how to challenge or critique them. Though I mentioned three examples of analysis by Schank in the previous post, I mostly focused on Tawana Brawley and on Iran Air Flight 655 getting shot down by the USS Vincennes, leaving Canadian Olympic gold-medalist runner Ben Johnson aside.

In musing over my dialogue with Schank, I wanted to offer some preliminary stabilization of my thoughts in terms of some theses to historiography---especially in the aftermath of the political display of revisionist Southern supremacy history in Charlottesville. Some of these are issues that Schank is concerned about and some of them I do not take him to care about.

  1. Schank is right that story skeletons and their selection form an initial stance on the problem that even leads to the filtering of information or adaptation of memories. In fact, if Schank had the benefit of hindsight that we have, he could have pointed to the US Navy's explanation of scenario confirmation offered to the BBC in 2000 for the precise behavior that his story skeletons had predicted in terms of filtering incoming information.
  2. Schank correctly distinguishes between story, skeleton and gist. The story is the actual production of the human talker, targeting the listener. The skeleton is the summary of the basic telos and gives explanation patterns at hand (more on that probably in Schank's prior book on Explanation Patterns). The gist is the cake mix, to use an analogy from cooking, that can be turned into a variety of cakes that all share a distinct family resemblance. Knowledge about what my current guests like and what I have quickly at hand informs the actual execution.
  3. Schank's skeletons provide an easy ingress into his theory for content analysis. The arguments made and the supports advanced and the information elided (if we can reconstruct it) come from the telos of the skeletons that is truly most like a political stance.

  1. Because Schank does not do much with the evidence/story or fact/narrative distinction (possibly because of his biases against expert systems and automated theorem proving), he misses out on the ability to postulate that facts are stories for a subculture that no longer can or wishes to treat these stories narratively. They are as if baked, and questioning them can be interpreted as a violation of the norms of the subculture. Even low level sensor readings have these stories behind them, and this holds true even in particle physics (cf Knorr-Cetina), where the individual detectors in a setup such as the CERN super-collider are attributed 'personality' (in an anaphoric sense) because of their non-interchangeable behaviors. Unlike edible cookies however, facts can be unbaked back into stories if there are anomalies, provided the research data is available. Thus, bad footnotes or page references ghost through the literature until some brave grad-student hunts them down and slays them. For many historical documents, that is not possible, and here the community finds the line drawn for it. 
  2. Schank has no locus for the social role of power in the success of narratives, because he does not distinguish the UN Security Council adequately from a US district court or the admiralty of the US Navy. Schank tries to rope this in while looking at the story expectations, but that is really only a small aspect of the problem. His use of made-up stories or divorcee self-reports is equally ill-suited to discover this, as in the first case everyone knows the story is not real (thus there is no real validity requirement or possibility) while in the second case our culture considers it flat-out rude to question people's divorce narratives. 
  3. Schank is handicapped by his focus on two genres, the newspaper article and the transcript of a psychological experiment. He looks a bit at screenplays as well, but never clarifies how the fictional status of these works interacts with the realistic status of the newspaper article or the emotional stance of the transcribed individual.



Sunday, August 13, 2017

What about all these Stories? Rereading Schank

In the aftermath of the civil disturbances of Charlottesville, Michael Eric Dyson wrote in the August 12th, 2017 edition of the New York Times:
Such an ungainly assembly of white supremacists rides herd on political memory. Their resentment of the removal of public symbols of the Confederate past — the genesis of this weekend’s rally — is fueled by revisionist history. They fancy themselves the victims of the so-called politically correct assault on American democracy, a false narrative that helped propel Mr. Trump to victory. Each feeds on the same demented lies about race and justice that corrupt true democracy and erode real liberty.
Dyson then distinguishes between political memories, between revisionist history and other history (proper? academic? he has no name for it, but cites Du Bois), between true and false narratives.

Tools of Story Telling

It is in this context relevant to look at the infrastructure that Roger Schank has identified for story telling (bracketing Schank's occupation with intelligence, however). Because of his particular slant to story analysis, Schank provides a repertoire of narratological tools and distinctions that differ from those in the literary toolkit. (This cuts both ways, Morson argues in the Foreword, missing for example the notion of Genre in Schank's musings, p.xxff.)

Stories are received and generated structures that people label in an attempt to index them, so that they can retrieve them and match them for comparison purposes. People are assisted in story manipulation by story skeletons, which give a basic direction to a narrative, and by the hierarchy of scripts, plans, goals and themes that Schank had identified in his classical work with Robert Abelson (1977). Because stories can be retold in a variety of ways and for various purposes, Schank posits that they are remembered in a different format, the gist, of which the individual retelling is a production. Stories vary by culture and subculture---Schank specifically discusses the French restaurant on the one hand and the teenage rock fan on the other.

In much of the exposition of the book, and especially in the discussion of the story production and generation, Schank brackets the question of truth. Schank writes as if the intentionality of the stories would make the question of whether the stories are true or not irrelevant or even impossible to answer.

Diplomatic Incidents

Schank uses a diplomatic conflict, the shooting down of an Iranian airliner by a US Navy warship, the Vincennes, on July 3, 1988, to illustrate the way that the same event is used to make hay for the divergent political positions.
If we construct our own version of truth by reliance upon skeleton stories, two people can know exactly the same facts but construct a story that relays those facts in very different ways. (p.152)
Or even more patronizingly:
The real problem in using skeletons this way is that the storytellers usually believe what they themselves are saying. Authors construct their own reality by finding the events that fit the skeleton convenient for them to believe. They enter a storytelling situation wanting to tell a certain kind of story and only then worrying about whether || the facts fit onto the bones of the skeleton that they have previously chosen. (p.154f)
Schank then goes on to label the skeletal stories that backed the reactions of the various governments: for the US, the understandable tragedy (p.152f); for Great Britain, the justifiability of self-defense (p.153); for Libya, insolence and state terrorism (p.154); and for Bahrain, justifiable bad effects of war on the aggressor and moral courage (p.155).

So, we can summarize as a first impression that, while Schank will admit that some facts do not fit onto the bones of a skeleton, he is willing to talk about people's own realities and how the skeleton stories are what make up people's notions of truth.
... no matter what happens next, all the viewers of the play [that plays out in the international diplomatic incident, RCK] will retell the story according to the skeletons they have already selected; i.e. they will probably not be moved to reinterpret any new event in terms of some skeleton that they do not already have in mind. (p.158)
So even though the stories dominate to the point of where they shape the memory of the events and the way that future facts will attach to past narrations, there is still something that Schank can call coherence.
One of the oddities of story-based understanding is that people have difficulty making decisions if they know they will have trouble constructing a coherent story to explain their decision. (p.159)
So justificatory stories do have coherence requirements, otherwise they will not properly support the decisions through explanations. Schank goes on to demonstrate this with divorce stories (pp.160ff), which however need not concern us here.

Stories that were Challenged

While there is much plausibility to the claim that political agents have decided a priori how they will respond to events in the arena that is international diplomacy---theorizing, as Sherlock Holmes would warn us, in advance of the facts (p.159)---matters become more complicated when Schank looks at stories that were strongly challenged by agents able to do something about disagreement.

Schank discusses this problem in two examples, the claimed rape of Tawana Brawley by white supremacists (pp.208ff) and the disqualification of Canadian runner Ben Johnson (pp.210ff) for allegations of doping. However, the larger context is the problem of the cultural and subcultural specificity of stories, and how to acquire that skill (pp.189-218). As a result, the analysis of these stories that were challenged (and ultimately rejected) by authoritative agents comes across as a matter of choosing bad raw materials for story telling on the part of Brawley and Johnson.

Tawana Brawley

... Tawana Brawley ran away for four days, and she needed to tell a story. For her own reasons, she decided not to tell the true story of where she had been, but to invent one instead. (p.209)
It is easy to agree with Schank here, though one has to emphasize that he distinguishes between "true story" and "invent(ed) one" explicitly.

Schank then continues:
When we invent a story in order to mislead people, we try to figure out the story that they want to hear, and we tell it. Children frequently tell a he made me do it story or an it wasn't me that did it story when they are caught having done something wrong. And this kind of story is what Tawana Brawley told too. (p.209)
Again, we are on the same page as Schank here. But then Schank takes a departure that is interesting for the problem of reliability.
Her [Tawana Brawley's, RCK] problem was selecting a believable story. She failed to assess how many listeners would hear her story and failed to understand that what each of them wanted to hear was quite different. (p.209)
Even with the preceding discussion of enculturation into story cultures and subcultures and of the difficulties teenagers have in acquiring these skills (p.208), the turn here is unexpected. The problem according to Schank at this point is not that too many listeners would eventually exhaust Tawana's range of fictional supports, but that their story expectations would eventually exceed her capabilities in some other way.

Schank argues that Brawley picked two basic skeletons, both shockingly realistic in 1980s America (and, as Dyson might remind us, even in 2017), namely young girl is kidnapped and raped as well as young black is victim of racial attack (p.209). Crucially for Schank's argument, the first one comes from the general American culture, and the second one from Brawley's subculture (p.209).
While either of these standard stories is bad enough, the combination of them produced something we can only assume Tawana had not counted on---the match of two types of stories || sought by the news media and black activists. (pp.209f)
Schank then discusses these two groups in turn:
The media looks for horrifying stories involving assaults on especially innocent people. Consequently, Tawana's story matched a skeleton story that news people are always looking to report. (p.210)
Her [i.e. Tawana Brawley's, RCK] story also matched a skeleton story that black activists are always on the lookout for: innocent blacks as easy victims of white officials. She had added that her rapists were state police officers and other officials of her area. The factor, of course, had it been true, should well have caused alarm on the part of the activists. (p.210)
Schank is entirely plausible in identifying the activation patterns for various groups' involvement, and in arguing that Brawley did not understand how her chosen skeletons meshed with these activation skeletons. But even here, Schank is willing to admit that there are true stories of innocent blacks as victims---"had it been true" (p.210)---that activists should be alarmed by.
Tawana's mistake, apart from fabricating events in the first place, was to invent stories without understanding the standard nature of the stories that others look for. (p.210)
There is a polite and a critical reading of this sentence. The admission, and as an aside only, that "fabricating events in the first place" was a mistake is either Schank's backdoor to admit that the events matter (this is the polite reading) or an indication of his lack of appreciation for how much events matter (the critical reading), i.e. that Brawley's failure to grasp "the standard nature of stories" (p.210) pales to the point of irrelevance in comparison with the need to fabricate facts that could hold up to the scrutiny of multiple organizations.

It is thus with much less agreement than previously, or with a feeling of the emphasis lying on the wrong part, that we head into the description of the role of the standard nature of stories.
Thus, when doctors are handed a rape case or a case of unconsciousness from beating and deprivation, they have certain tests they perform to aid the victim. They also use skeleton stories to understand such cases, and they seek to fit the details of any new story into the familiar story. (p.210)
This is just an adapted restating of Schank's general epistemology that understanding in humans is story-based.
Similarly, the police know a story about rape, and in the case of Tawana Brawley, they tried to fit the details of her story into theirs. (p.210)
As a result of casting the problem in this way, Schank posits that the police became suspicious because of the mismatch between Tawana's story and their expected story.
For both the doctors and the police, Tawana's story did not agree with the standard stories about rape, and this disparity caused them to question the truth of Tawana's claims. (p.210)
Notice that the whole problem of truth---now however applied to claims, not to facts or events or stories---is admitted and raised by Schank. Yet his conclusion is cast in terms of story skeletons and of subcultures:
Not surprisingly, since she is young and rather unsophisticated, Tawana Brawley failed to understand how to make up a story that matched the ones that the people who were listening to her expected to hear. She did not know the stories of the other subcultures. (p.210)
And almost wistfully, Schank concludes:
Traveling across cultures, it seems, requires the help of a translator. (p.210)
Perhaps a stronger stance that Schank could have taken was to fall back on his prior notion of scripts. Tawana failed because she did not really understand the rape script and the deprivation script, and was therefore unable to tell stories that matched well enough with the scripts that the doctors and the police officers had for these incidents.

Furthermore, as the New York Times article that Schank quotes (pp.208f) suggests, while the doctors and the police officers took their departure from their questioning the validity of the story, they did not condemn her on that account but used it to find actual evidence that contradicted Tawana's narrative. Notice the use of the word "evidence" in the cited article:
The conclusion that Miss Brawley fabricated her story is supported by ... evidence that she ran away and spent much of the next four days at her former apartment, evidence that she concocted the condition in which she was found, and evidence that she tried to mislead the police, doctors and others about what had happened to her.  --New York Times, September 27, 1988, cited in: Schank 1990, (p.209)
In fact, the article, a two-page spread as Schank admits (p.208), was called "Evidence points to Deceit by Brawley".

Even granted all the usual caveats of the mutual interaction between evidence and narrative, it seems hard to support the stance that for Miss Brawley, the story skeletons did her in. At best they tipped the police off to problems. And the failure in the collection of confirming evidence, which would have been required equally for a court of law, would have done her in just as much as the success in finding fabricated evidence.

Canadian Runner Ben Johnson

Schank of course believes that he has proven that Brawley became entrapped by being unfamiliar with the story expectations. Thus he can use the case of Ben Johnson, a Canadian runner disqualified from the Olympics for doping, as "another case of story misunderstanding" (p.210). 
The story is interesting in this context because the press assumed that everyone who used steroids also knows how to avoid getting caught. (p.211)
Schank's quotes from the NY Times actually do not bear this statement out. It is Dr Voy, the chief medical officer of the US Olympic Committee, who is quoted as commenting on the masking know-how in the athletes' community.

Schank then goes on to enumerate the stories that Dr Voy, the physician and official, has---to wit: the miscalculate the dose, the screwy system, and the reckless gamble story, the panic to insure victory story and the masking the drug story (p.211)---as well as the stories that Johnson was familiar with---to wit: spiked my drink with drugs and bad lab test (p.211).

The first thing to notice is that Voy's and Johnson's narrative intentions are at odds. None of Dr Voy's stories would have gotten Johnson a pass for testing positive. Indeed, the article gives no examples of stories that Dr Voy would have considered legitimate for a positive testing candidate to continue on as a medal bearer. So Johnson could not have used any of these stories. In fact, all we learn is that the stories Johnson alleged were not part of Dr Voy's repertoire of acceptable excuses; that set may have been empty. If there was no story for Johnson to choose, then Schank's comment, that "in essence, the only thinking we have here is the selection of stories" (p.211), is puzzling in the extreme.

Summary

Though Schank at times sounds as if he is flirting with both sides of the relativism divide, this effect is produced partially by sloppy wording and partially by not distinguishing the following points correctly:
  1. Even if skeletal stories are hard to disprove because of their intentionality, which functions as a political stance does, there are notions of quality of match that can lead to not selecting them.
  2. Stories include factual information and can be challenged much more readily along that dimension. Admittedly, facts may themselves be the outcome of stories (a topic that Schank unfortunately bypasses), and facts have some of the same subcultural aspects that Schank identified for the stories (e.g. water-logged dinosaurs supporting a Biblical flood narrative) as to acceptability and validity.
  3. Though it may not be possible to show that some stories are true, it is sufficient to be able to show that some stories are false for historiography to have the ability to stem the flood of politically-charged revisionist history. Tawana Brawley's story for sure and possibly Ben Johnson's story as well (Schank's source material is too terse) are stories that are false, due to the evidence found, not the skeletons chosen.
It is possible that Schank's side-stepping of the question of truth is due to his general goal of moving AI work from theorem proving and expert systems (p.xlii) toward story generation / understanding / summarization and explanation.

If we take a step back and look at the larger sources of information available for the examples that Schank uses, such as the Admiral Fogarty report from August 1988 or the investigative research journalism by the New York Times reporters Robert McFadden et al, Outrage: The Story Behind the Tawana Brawley Hoax, from August 1990, we get the suspicion that Schank is used to the small forms, such as the newspaper article, not the hundred-plus-page slugfest of details that Fogarty or McFadden and their collaborators provided.

Postscriptum

In an amusing twist, McFadden's book Outrage, published contemporaneously with Schank's book in 1990, possibly even contradicts the sequence of events claimed by Schank. 
Later, Dr [Alice, RCK] Pena [the emergency room physician, cf. p20, RCK] had her put her feet into stirrups for an examination of the vaginal, rectal and pelvic areas. The examination revealed no cuts, dried blood, bruises, swelling, deep redness, or other indications of injury. There were no signs of trauma to the mouth …. Indeed, the girl’s [Tawana Brawley’s, RCK] teeth were surprisingly clean and her mouth did not even have a bad odor. Dr Pena decided to forego the use of a rape-detection kit, at least for a while. There weren’t enough signs to warrant it …. (p.21)
At this point, Tawana was still giving signs of semi-consciousness (p.17) and had hardly launched into a story of the type that Schank is looking for. Already the initial interaction with the paramedics and the emergency room physician was gathering evidence against the narrative that Tawana was planning or had been told to use.

Bibliographic Record

Roger C. Schank, Tell Me A Story: Narrative and Intelligence, Evanston, IL (Northwestern University Press), 1995 (= reprint of the 1990 edition published in New York (Scribner), 1990; with a foreword by Gary Saul Morson, from 1995). [Google Book Selections]

Monday, July 31, 2017

Some notes on Corpus Linguistics and their Criticism

My friend Paige Morgan, Digital Humanities Librarian at the University of Miami, recently suggested Laurence Anthony's tool AntConc for corpus linguistic analysis. The tool is well supported and straightforward to learn, thanks to online tutorials and YouTube videos, some of them even by the maintainer himself.

As so often, from that seed one can spiral out into similar corpus-investigating tools, such as the online Voyant Toolkit, which makes statistical properties of texts visible and accepts cut-and-pasted text for quick analysis. I am not clear yet how to use some of the tools, such as Cirrus-display or the term-berry, but my example was rather short and had little in terms of repeating words.
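To make concrete what "statistical properties of texts" can mean at the simplest level, here is a minimal sketch in Python of a term-frequency count, the kind of tally that a word-cloud display presumably starts from. This is my own illustration, not how Voyant computes anything: the function name term_frequencies, the naive tokenizer, and the sample sentence are all assumptions made up for this note.

```python
import re
from collections import Counter


def term_frequencies(text, top_n=5):
    """Return the top_n most frequent lowercase word tokens with their counts."""
    # Naive tokenizer (an assumption): runs of letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)


# Invented sample sentence, just to have something to count.
sample = "The story is the skeleton; the gist is not the story."
print(term_frequencies(sample, top_n=3))
```

With a text as short as my example was, almost every token occurs once, which is why such displays had little to show.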

Because Corpus Linguistics has been going on for a while now, there are a couple of good articles, and even books on Amazon, on how to construct corpora, mostly targeting linguists however, with a few exceptions.

There has also been criticism in the community, distinguishing corpus linguistics from discourse analysis. The recent exercise that I went through for the purposes of this blog, on Ernest Gellner's book Plow, Sword and Book, effectively plays off the discourse analytic stance against the corpus linguistic one, as I now suspect. The most recent book I was able to obtain that splits the difference comes from the analysis of academic writing.

There have been more principled attacks against corpus linguistics, for example from Noam Chomsky (discussed here), which see in corpus linguistics the mistaken assumption that data will eventually induce itself into a theory. (That sounds familiar ....)

There are also people who simply try to explain how corpus linguistics came about and developed, e.g. Karen Fort at INIST.fr. Fort reminds her readers of the incident reported in Hill 1962, when Chomsky overplayed his hand by denying that perform could be used with mass nouns in English, citing himself, as a native speaker, as the authority. The British National Corpus revealed, however, that perform magic is indeed just such a construction. (I would argue that Chomsky fell into a recognition/recall trap here, overestimating the latter based on his excellence at the former. Still---bad for him.)

A more detailed review with 20th century references from the University of Lancaster can be found here. It cites British linguists like H. Widdowson, who in 2000 wrote an article entitled The Limitations of Linguistics Applied, published in Applied Linguistics 21/1, pp.3-25.
For, obviously enough, the computer can only cope with the material products of what people do when they use language. It can only analyse the textual traces of the processes whereby meaning is achieved: it cannot account for the complex interplay of linguistic and contextual factors whereby discourse is enacted. ... In reference to Hymes' components of communicative competence (Hymes 1972), we can say that corpus analysis deals with the textually attested, but not with the encoded possible, or the contextually appropriate.  (no page number provided in extract)
Though Widdowson is talking about language learning, he makes a point that bears repeating in a larger context:
... the textual product that is subjected to quantitative analysis is itself a static abstraction. The texts which are collected in a corpus have a reflected reality: they are only real because of the presupposed reality of the discourses of which they are a trace.  (no page number provided in extract)
That seems to me to be the entry point for the importance of discourse analysis. In fact, Widdowson later summarizes his point as
... corpus linguistics provides us with the description of text, not discourse. (no page number provided in extract)
The document then provides a rejoinder by M. Stubbs from 2001, entitled Texts, corpora and problems of interpretation: A response to Widdowson, published in Applied Linguistics 22/2, pp.149-172.
Corpus linguistics therefore investigates relations between frequency and typicality, and instance and norm. It aims at a theory of the typical, on the grounds that this has to be the basis of interpreting what is attested but unusual. (no page number provided in extract)
And more fully later on:
Frequency is not necessarily the same as interpretative significance: an occurrence might be significant in a text precisely because it is rare in a corpus. But unexpectedness is recognizable only against the norm. (no page number provided in extract)
Stubbs observes that this insight is especially important for recognizing conventionality:
... [A] major finding of corpus linguistics is that pragmatic meanings, including evaluative connotations, are more frequently conventionally encoded than is often realized (Kay 1995; Moon 1998; Channell 2000). (no page number provided in extract)
Concepts of convention and norm raise problems in the not infrequent cases when interpretations diverge. (no page number provided in extract)
Stubbs cites the case of cronies in corpus-based dictionaries, but emphasizes that the analysis is made possible by the empirical aspect of corpus studies.
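Stubbs's remark that unexpectedness is recognizable only against the norm can be made concrete with a small calculation. The following is a minimal sketch, not Stubbs's own method: it compares the relative frequency of a word in a single text with its relative frequency in a reference corpus. The function names, the floor value and the toy sentences (built around cronies) are my own assumptions, invented for illustration.

```python
from collections import Counter
import re


def relative_frequencies(text):
    """Map each lower-cased word to its share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def unexpectedness(word, text, reference, floor=1e-6):
    """Ratio of a word's relative frequency in the text to that in the reference.

    The floor stands in for words the reference never attests, so the ratio
    stays finite; its value is an arbitrary assumption.
    """
    in_text = relative_frequencies(text).get(word, 0.0)
    in_reference = relative_frequencies(reference).get(word, floor)
    return in_text / in_reference


# Toy data, invented for illustration only (not taken from any real corpus).
reference = (
    "the minister met his colleagues and the minister met his friends " * 50
    + "the minister met his cronies"
)
text = "the minister met his cronies"

# A word that is rare in the reference stands out far more than a common one.
print(unexpectedness("cronies", text, reference))
print(unexpectedness("the", text, reference))
```

The ratio is large for cronies and close to 1 for the, which is the point of the quote: the rare word is conspicuous in the text only because the reference corpus establishes what is typical. Whether that conspicuousness is interpretatively significant is still a question for the reader, not for the ratio.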

A 2016 paper in Dialogic Pedagogy by Richards and Pilcher (which works with a somewhat static distinction of objective and subjective language systems) quotes Ädel (2010), p.48, who worries about "the inevitable focus on surface forms in corpus work" as well as "the risk of focusing exclusively on the word and the phrase level when using computer-assisted methods" (p.49).
If it was, in contrast [to the previously stated, RCK] accepted that the usage and the meaning of the language was creative [IS2], individual [IS1], and only represented the inert hardened crust of the language [IS4] then the linguist would be unable to analyse it isolated from the context in which it was used. (A128)
Except, what choice do the historians have? Voloshinov (or Mikhail Bakhtin, however the debate around Morris 1984 comes out, cf. A123) criticized the departure of linguistics from the antiquarian concerns, where
"the ancient written monument [is considered] ... the ultimate realium" (Voloshinov, 1973, p.73, cited in Richards & Pilcher, A123).
Richards and Pilcher cite Bakhtin in observing:
Fundamental to the meaning of language in such a view ... are dialogue and context. The importance of dialogue (Bakhtin 1981, 1986) means that language consists of a stream of unfinished utterances that is continually evolving and is never completed. (A129)
With Bakhtin, context is referred to as a linked chain of previous utterances (A129). Richards and Pilcher mostly focus on spoken language here (e.g. the contribution of intonation to interpreting the word `well`), but some of their concerns hold in larger settings as well. Citing Fecho, they write
... to expect that just because you and I are using the same term or phrase that we have a consensus understanding of its meanings is to deny that context and experience have anything to do with our understandings (Fecho, 2011, p.19; cited A130)
Corpus linguistics then generates frequency lists of
... decontextualized signifiers, which in turn are only evidence of past thoughts. (A130)
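To see what such a list does and does not retain, here is a minimal sketch in Python. It is not how AntConc or Voyant are implemented; the function names and the sample sentence (built around the word `well`) are my own assumptions. The frequency list reduces every occurrence to a count, while a keyword-in-context concordance keeps at least a small window of neighbouring words.

```python
from collections import Counter
import re


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_list(text):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokenize(text)).most_common()


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for every hit."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Invented sample: four occurrences of "well" with different functions.
sample = "Well, that is a deep well. Well done, but the well is dry."

print(frequency_list(sample)[:2])  # [('well', 4), ('is', 2)]
for left, token, right in concordance(sample, "well", width=2):
    print(f"{left:>12} [{token}] {right}")
```

The count of four treats the discourse marker, the adverb and the noun as the same signifier. The concordance lines restore some textual context, but nothing of the intonation or the chain of previous utterances that Richards and Pilcher have in mind.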
There was finally also a paper by Nelya Koteyko trying to distinguish the different forms of discourse used in science and their applicability to corpus linguistics, but I did not quite catch the main drift.

Reading with Context in Mind: Plough, Sword and Book (Part 1)

The following discussion analyses a few pages of material from Ernest Gellner's Plough, Sword and Book: The Structure of Human History, London (Collins Harvill) 1988. The idea behind the exercise is to distill out in exemplary fashion the contextual form of book contents that makes some of the strategies of statistical or pattern-based NLP less helpful than one might hope.

In addressing the problem of the role that primitive man plays in modern political thought (the chapter is entitled, "Which way will the Stone Age Vote swing?"), Gellner analyzes the way in which some philosophers talk about previous social states and their impact on morality.
In between the two extremes---candid fictional reconstructionists and paid-up professional anthropologists---there are others who, while not professional specialists in the area of early man, nevertheless intend their affirmations about him [i.e. early man] to be realistic, not mere fictions, but who wish them, all the same, to point a moral for the conduct of our own social life. (p.26)
The fact that this is Gellner's analysis means that this paragraph is his own, and he owns the words in it and the thoughts as well. Thus, if Gellner, to concoct an example, were to deny ever using the word "realistic", one would be justified in pointing to this paragraph and contradicting that assertion.
For instance, one of the profoundest and most influential of prophets of modern economic and other liberalism is F. A. Hayek. Hayek's analysis of the options and perils of modern society do in fact dovetail with a sharply delineated vision of the primitive social order and ethos. On his view, the strong social morality of early man and its survival in contemporary society constitute a positive danger to us: (p.26)
This paragraph is a potential mixture; it starts out as something that Gellner says about Hayek, but toward the end it becomes possible that Gellner uses diction that is more properly considered Hayek's than Gellner's. This is because Gellner is trying to reconstruct Hayek's thought, and when people do that, they often use the words that the author to be reconstructed employs. For example, if the expression "positive danger" strikes readers as interesting, this paragraph would be ill-suited to determine if Hayek or Gellner would use that expression.

Gellner then quotes Hayek, using a block-quote with reduced font-size to indicate that this is so (p.26):
There is ... so far as present society is concerned, no "natural goodness", because with his innate instincts man could never have built up the civilisation on which the numbers of present mankind depend for their lives. To be able to do so, he had to shed many sentiments that were good for the small band, and to submit to the sacrifices which the discipline of freedom demands but which he hates. The abstract society rests on learnt rules and not on pursuing perceived desirable common objects; and wanting to do good to known people will not achieve the most for the community, but only the observation of its abstract and seemingly purposeless rules.^(Fn6) (p.26)
Notice that this paragraph needs to be attributed to Hayek---as footnote 6 tells us, here indicated by ^(Fn6), the passage is from page 20 of Hayek's work The Three Sources of Human Values, published by the London School of Economics and Political Science in London in 1978. Thus, Hayek is the one who talks about submitting to sacrifices, and the discipline of freedom, not Gellner. (Gellner may of course share that view, but we have so far not seen any textual material to support such a supposition.)

Gellner then prepares to quote another passage from Hayek, which he relates to the first passage via an editorial comment on the stance that Hayek has taken.
And, should anyone not understand what this implies in practice, let it be spelt out: (p.26)
So even though Gellner is pointing out a relationship in Hayek's thought, Gellner is at least to some extent using his own words to express that relationship and thereby give an interpretive bias to the reading of the Hayek passage.

The following quote is again distinguished by a smaller font and a blockquote presentation, marking it as a direct quote from a writing by Hayek (minus the editorial [The] that Gellner felt should be inserted).
... the long submerged innate instincts have again surged to the top. [The] demand for a just distribution in which organised power is to be used to allocate to each what he deserves is thus strictly an atavism, based on primordial emotions. And it is these || widely prevalent feelings to which prophets, moral philosophers and constructivists appeal by their plans for the deliberate creation of a new type of society.^(Fn7) (pp.26f)
The passage is from the same work, as we learn from Fn 7, but two pages earlier, p.18. Notice that this casts an odd light on Gellner's transitional phrasing, because having the implication of a thesis precede the thesis is an odd way to present an argument, and cannot really be called a spelling-out, given that most people read from the low to high page numbers.

Gellner now tries to give Hayek a strong interpretation, one that makes plausible why someone would agree to Hayek's claims. That means, we should expect to encounter Hayek's thinking presented in Gellner's words with Hayek's phrases sprinkled throughout.
The picture is striking and suggestive. Men must have lived in something like "bands", groups too small to be capable of imposing abstract and impersonal rules, for a long time---during the overwhelming majority of generations since the inception of humanity, however that inception may be dated. So, on this view, throughout most of our history our situation instilled in us an ethic which is directly opposed to all that is innovative, creative, progressive in human civilization. Hence, civilization is based on the overcoming, not so much of our lowest instincts, but on the contrary of all that had usually been held to be moral: the social impulses of mankind, the tendency to cooperate with fellows in the pursuit of shared aims. Respect for abstract and incomprehensible rules must replace love of fellow men and community and a sense of shared purpose, if civilization is to emerge and survive. (p.27)
We can see immediately that this is a paraphrase of Hayek's thinking; so far, Hayek has only spoken of "abstract and seemingly purposeless rules", which under Gellner's reconstruction morph into "abstract and impersonal rules" or "abstract and incomprehensible rules". Neither phrase is far from Hayek's meaning, but neither is synonymous with it.

After all, rules that are seemingly without purpose may still be personal and comprehensible---for example, the rule that after each 1000-point drop in the Dow-Jones Industrial Average, the CEOs of the Fortune 500 have to apologize on public TV and do 20 push-ups. That rule is both personal and comprehensible, just not very conducive to any purpose that we associate with economics.

It remains possible that Gellner was citing phrases from other sections of the Hayek paper that he did not quote directly; we lack the information to determine this. But we can already see that the complex interplay of commentary ("The picture is striking and suggestive"), expansion of ideas (the speculation on the small groups, with the word `bands`, a term that Hayek uses without quotation marks, now in double-quotes) and summaries of arguments ("the tendency to cooperate with fellows in the pursuit of shared aims") makes it very difficult to attribute phrases and ideas to either Hayek or Gellner directly. At the same time, I cannot shake the feeling that, if someone cited the sentence
Respect for abstract and incomprehensible rules must replace love of fellow men and community and a sense of shared purpose, if civilization is to emerge and survive. (p.27)
as Gellner's point of view, Gellner would probably protest, calling such use a case of being quoted out of context, and seeing himself as primarily elucidating Hayek---even if Hayek should object to the rules being labeled "incomprehensible".

It is important to note that Gellner strengthening Hayek's stance by paraphrasing and working to give it additional plausibility is not a fault at all, but one of Gellner's qualities as a good writer, as someone who tries to compact Hayek's writing for an audience unfamiliar with Hayek's views and takes the arguments of Hayek seriously.

Not very surprisingly, Gellner continues to unfold Hayek's thought, formally repeating the methods of the previous paragraph, elucidating and bringing in new notions from Hayek's extensive oeuvre.
In Hayek's vision, an unplanned, unintended culture, which was the fruit neither of conscious reason nor of animal instinct, had somehow arisen, and it alone made possible that automatic mechanism of response to need, that sustained innovative improvement, which is manifested in and fostered by the market. ... || ... Seeing how recent and precarious our liberation from over-socialization is, it is surprising that we are not even more thoroughly in thrall to atavistic sociability than Hayek fears. It is a social ethic and cohesiveness, not their absence, which are our greatest threat. (pp.27f)
Even though Gellner explicitly labels the exposition as Hayek's (his "vision"), he is now bringing in terms that are not licensed by the quotes and that presumably come from other parts of Hayek's oeuvre, key among which are the notion of the market and the notion of natural selection. (Gellner errs in not providing footnotes for the source of these concepts in Hayek's writing, but increasingly, footnotes have come to be viewed as pedantic and no longer in need of precision.)

At the same time, the paraphrasing with its subtle shifts of meaning continues; Hayek, in the quotes at least, never spoke of "animal instinct", but of "innate instincts". For example, with Steven Pinker we could claim that language belongs to the innate instincts, but it clearly is not a good example of an animal instinct, given how few (or none, if one follows Pinker) animals share our form of language.

Since the point of the exercise is not to understand either Hayek's or Gellner's argument, just how the thoughts and the linguistic presentations of these thoughts co-occur in a specific text, we will not delve into the elided middle section of that paragraph of Gellner's (see the ellipsis in the quotation from pp.27f above) and continue straight to the next paragraph of Gellner's exposition, on page 28.
Hayek's way of presenting our general condition differs from what might be called the simplest or classical formulation of laissez faire liberalism, in his conscious stress on the cultural preconditions of an open or market society. The classical formulation suggests that its only important condition is a political one: a just, effective and unrapacious state must be present, a political authority which uses its power to keep the peace and uphold the rules, and does not use it simply to despoil civil society. Hayek's new way of defining the problem makes him insist that mere political order is not enough, that a certain kind of abstract culture is also required, the emergence of a sense of and respect for abstract rules, and a detachment from communal, cooperative ends. The Hidden Hand can operate only in a suitable cultural milieu, amongst men who are not too sociable, men who respect rules rather than social aims. (p.28)
At this point, Gellner is contrasting two stances simultaneously, what he calls the classical formulation of liberalism and Hayek's form. These two stances are distinguished, according to Gellner, in their emphasis, the one putting the accent on the political and the other on the cultural preconditions.  Because the paragraph has the potential for losing even the interested reader, Gellner has begun to use emphasis (italics) in the typography to distinguish the key terms that identify the stances. Such an emphasis would have been even more effective if he had not been forced, by convention, to also italicize the French (i.e. non-English) phrase "laissez faire" in the same paragraph.

We observe that Gellner's use of italics for emphasis is in general ambiguous as to its precise meaning; on p.27, he used it to mark the introduction of the important economic term of the market. That market's regulatory force, the Hidden Hand, however, receives no such emphasis and is capitalized as an agentive force instead.

Our little excursion is almost completed at this point, as we turn to the last paragraph in Gellner's consideration of early man. Gellner turns to the social theory of Hayek's one-time colleague at the London School of Economics and former compatriot, Karl Popper.
A very similar sense of the struggle, not with destructive animal instincts, but on the contrary with an oppressively social morality and the deep feelings which underline it, is also found in the social thought of Karl Popper.^(Fn8) This Hayek/Popper vision might well be called the Viennese Theory. One may well wonder whether it was not inspired by the fact that, in the nineteenth century, the individualistic, atomized, cultivated bourgeoisie of the Habsburg capital had to contend with the influx of swarms of kin-bound, collectivistic, rule-ignoring migrants from the eastern marches of the Empire, from the Balkans and Galicia. Cosmopolitan liberals had to contend, in the political sphere, with the emerging breed of national socialists. This "Viennese" vision is an inversion, a denial all at once of romanticism --- it elevates || Gesellschaft (society) over Gemeinschaft (community) --- and of Marxism. Marx anticipated the restoration, rather than the overcoming, of the alleged social proclivities of early man. (pp.28f)
The literature referenced for Karl Popper in Footnote 8 is his classic The Open Society and Its Enemies, London 1945. We observe that we again encounter the use of italics to highlight terms, e.g. national socialists (such as the Austro-Fascists, as compared to economic socialists in the sense of the followers of Karl Marx), as well as non-English terminology, such as the German words Gesellschaft or Gemeinschaft.

In this paragraph, the reader has to pay careful attention to attribute stances properly. Not only is Gellner discussing Popper, but he is aligning Popper with Hayek and even combining them into a single theory or "vision" (cf the Viennese Theory versus the "Viennese" vision).  At the same moment, for the first time in the selection that we have been analyzing, Gellner actually engages in historiography (we will get to this problem soon), so he is reporting background information to justify his postulate of a Viennese Theory. Thus, because he is not describing the stance of the theory, but its genesis, Gellner owns this paragraph and its terminology.

To summarize: In roughly three pages, Gellner has given us an exposition of at least two intellectual stances that he may or may not share, namely those of F.A. Hayek and of the classical formulation of laissez-faire liberalism; Popper really received one sentence only and was not given a single quote. Gellner has presented these positions in quotes, which are typographically distinct, and in paraphrases, some of which make inexact use of the terminology found in the reconstructed texts. Gellner has interleaved these reconstructions with his comments and, in the case of the Viennese Theory, even with an origins story that draws additional background information into the text. Notice that the brevity of the sample did not allow for some other occurrences one should expect in a book the size of Gellner's, such as direct quotes of other writers' interpretations of Hayek or Popper, or paraphrases of others' reconstructions of the texts under consideration.

It is perhaps not surprising then that the typical techniques of corpus linguistics cannot readily succeed in such a setting, even though there is a sense in which Gellner's book is a corpus of conceptual stances. Perhaps the reading is not distant enough.
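The attribution problem can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not an implementation of any existing tool: the Span record, its voice and mode labels, and both counting functions are hypothetical, and the labels were assigned by hand following the reading given above. The span texts are phrases from pp.26f as quoted in this post.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
import re


@dataclass
class Span:
    voice: str  # whose words these are, as judged by a human reader
    mode: str   # "quote", "paraphrase" or "commentary"
    text: str


# Hand-labelled spans; the labels encode my reading, not a ground truth.
spans = [
    Span("Hayek", "quote", "abstract and seemingly purposeless rules"),
    Span("Gellner", "paraphrase", "abstract and impersonal rules"),
    Span("Gellner", "paraphrase", "abstract and incomprehensible rules"),
    Span("Gellner", "commentary", "The picture is striking and suggestive."),
]


def tokens(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def pooled_counts(spans):
    """Count words over the whole page, ignoring who is speaking."""
    return Counter(tok for span in spans for tok in tokens(span.text))


def counts_by_voice_and_mode(spans):
    """Count words separately for each (voice, mode) combination."""
    table = defaultdict(Counter)
    for span in spans:
        table[(span.voice, span.mode)].update(tokens(span.text))
    return table


print(pooled_counts(spans)["abstract"])
by_voice = counts_by_voice_and_mode(spans)
print(by_voice[("Hayek", "quote")]["purposeless"])
print(by_voice[("Gellner", "paraphrase")]["incomprehensible"])
```

The pooled count credits the book with three uses of "abstract"; only the hand-made labels show that one of them is Hayek's own and two are Gellner's reconstruction of Hayek. The labels are exactly the contextual knowledge that a frequency list over the raw page does not contain.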

Sunday, July 30, 2017

Contextual Interpretation

I am trying to put together lists of examples that show why munging together large amounts of textual data can often run afoul of tricks and traps that do not assist in historical analysis, or perhaps in other digital humanities work as well.

  • Shifts in the Meaning of Words 
    • "mother-in-law" in Pride and Prejudice actually means the stepmother (Jack Goody, Production and Reproduction, p.53)
    • "making love" in Victorian England means a man talking with an unmarried woman with the intent of espousing her (e.g. Ginger Susan Frost, Promises Broken: Courtship, Class, and Gender in Victorian England, 1995, p.69)


These specific cases are instances of the discourse not being identified properly, e.g. in its mode or in its temporal delineation. But there are more detailed comments we can make about the discursive nature and the context of statements, given a suitable example.

The problem of the range of the Discourse

During a discussion of couples' interactions in the New Yorker, the author reminded the reader that the range of a discourse in presidential politics and the presidential White House extends to the previous occupants and their actions as well.
On Tuesday, after Melania [Trump] appeared again to reject the President [Donald Trump], this time on the tarmac in Rome with a slick “down low, too slow” move, Pete Souza, President Obama’s official photographer, posted a photo to his Instagram account of Barack and Michelle tenderly holding hands in Selma, Alabama, a gesture that needed no interpretation.
This is an example of the kind of interaction that is difficult to track or detect without establishing the precise discourse that the item belongs to. Here models of layers of discourse that need to be attended to are crucial.

Eventually, the Washington Post made it clear at the descriptive level, by linking to these (and other) clips and photos, providing the interpretation for those who had missed the discourse contributions. So the hope of large-scale ingestion of documents for interpretation is that discourse contributions that are clever in the way that Souza's was will eventually find a kind schoolmaster who spells out what the others only suspected. (In some sense, the historians often end up in that role.)

Of course, not every hand-holding couple posted that day is a commentary on the Trumps' situation, but most likely, the George W. Bushes' holding hands would have been, within a specific window of time, of course.

Appendix


Wednesday, January 25, 2017

Lishi Website

Completely forgotten that I used to work on this (here my polyptique interests took their departure).
Probably would need to request a password again at this point in time.

Tuesday, January 24, 2017

IDP Reasoning System

IDP is a reasoning system of the DTAI at the University of Leuven. There is a web-interface that can be used for experimentation. The system differs from the ASP format in specific ways, though it competes in the tri-annual ASP competitions (for a download of problems see here). It supports representations such as Abstract Dialectical Frameworks.

Some Mamluk Details and Natural History Details

Been researching history of the Mamluks for a collaboration with some people in the Digital Humanities at the Austrian Academy of Sciences.

Found some cool resources on the web:

In communicating with my collaborators, I revisited the Timbersnake argument from my dissertation, and found some additional resources for that, including

  • Mark Catesby's Natural History of the Carolinas, Florida, the Bahamas etc. etc., with a cool depiction of a rattlesnake from volume 2, published in London in 1756.
  • The travels of Linnaeus's student Peter or Pehr Kalm of Sweden, who helped classify rattlesnakes for Linnaeus (as did Catesby's drawings)
    • An analysis of his journal (including on rattlesnakes)
    • His three-volume travel journal is available here in German for the First Part. Indubitably other parts are on Archive.org as well.