Showing posts with label reconstruction. Show all posts

3 Jan 2012

Baxter-Sagart reconstructions and Occam's Razor


The internet abounds with information if we make the effort to search. One interesting find is a PDF of the Baxter-Sagart reconstruction of Old Chinese roots in tabular format. Excellent! But being an analytical bad news bear, I also see some important issues that tie in with my stance on developing orthographies that properly conform to Occam's Razor. This is out of respect for logic, for necessary simplicity, for clarity and for general readers, some of whom may not be well-versed in linguistics but who are nonetheless interested in the beauty of a language and its history.

Contempt for Occam's Razor inhabits even mainstream linguistics, and the field is far too often misconceived as an intuitive art rather than a logical science. I put my money on organized phonologies and uncluttered orthographies that express only what's necessary for the topic at hand. It's not necessary to show the exact phonetics of a word each and every time when the discussion is not about the exact phonetics of a language. If we have a list of roots, it doesn't make sense to list it all out in excruciating phonetic detail any more than it makes sense to write English this way. As such, mixing IPA symbols into your orthography often spells more trouble than it's worth. "IPA" doesn't stand for International Orthographic Alphabet. At some point a decent linguist must come up with a sensible, legible, optimal, uncluttered orthography to express their language of study beyond the microscale phonetic level. A means, in other words, to quickly and clearly cite words in a vocabulary, pruned for immediate and sufficient comprehension by an everyday reader. Abusing symbols to complicate the message is as corrupt a practice as abusing unnecessary specialist terms for little reason other than show.

At the top of the list, the Baxter-Sagart team begins with roots like *ʔˤra. This shows us that they envision a phonemic pharyngealized glottal stop. Fine. However, unless */ʔˤ/ is phonemically distinct from other phonemes in the language, say */ʕ/, why be so precise on the orthographic level? Why not use a single clear symbol for this instead of mixing up orthography with the phonetic level far below it? If the orthography, in its necessary simplicity, doesn't make the intended phonetics very clear, one may simply write a quick primer on it and be done with it. If this were the only such case, I could concede that perhaps there's some reason for it that I've overlooked.

Further down the list, we also have *qˤrep, which is quite the tongue-twister. One may dismiss this as within the bounds of plausibility, although I do admit that this apparent pharyngealized uvular stop is unusual for its Schrödingeresque ability to inhabit two places of articulation at once. Then again, there are many consonant-rich languages like Klallam around, right? We also have to keep in mind, though, that these kinds of languages are quite rare, and there's nothing scientific or methodical about a theory that strives towards the exotic rather than the minimal. Strong proof should come before the addition of a new phoneme to a reconstruction.

But when we come across *qʷʰˤat-, what are Baxter and Sagart trying to express to us and how does it fit into a plausible phonological system? A labialized, aspirated, pharyngealized, uvular stop??? How on earth could this possibly be contrastive with another phoneme? Surely at this point we have to concede that Baxter and Sagart have not respected the differences between phonetic and orthographic transcription or the proper uses of each. It gives the impression of a poorly organized phonology and orthography, mixing exact and even unlikely phonetic symbols together to create a visual mess that ends up being more confusing to the reader than helpful. At this point, it's just not reflective of the facts, even when (and especially when) armed with knowledge of the IPA system!
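
For anyone who wants to unpack that stack of superscripts, here is a minimal Python sketch that spells out what each IPA modifier letter in the onset claims. The tables and the `describe` function are my own illustration, not anything from the Baxter-Sagart material, and they cover only the symbols cited in this post.

```python
# Illustrative tables only: they cover just the symbols cited in this post.
BASES = {
    "q": "uvular stop",
    "\u0294": "glottal stop",  # ʔ
}
MODIFIERS = {
    "\u02b7": "labialized",      # ʷ
    "\u02b0": "aspirated",       # ʰ
    "\u02e4": "pharyngealized",  # ˤ
}

def describe(segment):
    """Read a base consonant plus its modifier letters out in plain words."""
    base, marks = segment[0], segment[1:]
    features = [MODIFIERS[m] for m in marks]
    prefix = ", ".join(features) + " " if features else ""
    return prefix + BASES[base]

print(describe("\u0294\u02e4"))         # onset of *ʔˤra
print(describe("q\u02b7\u02b0\u02e4"))  # onset of *qʷʰˤat-
```

The second call reads the onset out as three secondary features stacked on a single uvular stop, which is exactly the pile-up in question.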

Keep in mind that others have already expressed concerns about the use of "j" in Middle Chinese onsets in words like gji (祇), considering that the "phoneme" doesn't seem to exist when compared to some loanwords coming from outside Chinese (e.g. MC *bjut [Baxter] < Sanskrit buddha 'enlightened one; Buddha'). There is indeed informational value behind "j" here but it's very unlikely to be a true semivowel or a palatalization of the preceding consonant. At some point then, we have to get back to reality, paying careful heed to creating a balanced, minimal orthography, because overcomplexity quite simply hampers progress in all things.

16 Oct 2011

Egyptian vowel reconstruction and other gripes


Occam's Razor is a valuable tool to the student and scholar. It forces us to think hard about the assumptions we hold on to and whether they are absolutely justified or whether there's room for doubt. Linguistics seems to be one of those studies where this methodical principle is still not respected to the level that it should be and, as a result, there are many ancient languages being reconstructed with too much artistic flair to properly reflect the data.


Diversity of plausible theories or diversity of empty opinion?

I've been very busy collecting data on Ancient Egyptian after growing dissatisfied with the lack of profound discussion or clarity on its vocalism. Egyptologists constantly write words with only their consonantal values to reflect how the Egyptians themselves wrote these words. This is how it's always been. However, I find that it often does more to obstruct and obscure the proper reading of these texts than to aid us. It doesn't take a rocket scientist to figure out that Egyptians themselves wouldn't have thought of words purely in terms of consonants. Some of the clever word puns exhibited in Egyptian texts require knowledge of the vocalism too if we are to grok their fullest meaning and pattern. After centuries of Egyptomania, why is there no clear consensus on the Ancient Egyptian vowel system? What's the holdup? Are we interested in Egyptian or not?

To illustrate the point, let's take the word for 'cat' which may be represented consonantally as mỉw. Here is the mountain of possible reconstructions for the utterly confused outsider to select from:
  • Albright *mắȝĕʔ
  • Callender *máȝejvw
  • Garnot *mṓȝei̯
  • Smieszek *må̆ȝjᵉw
  • Vergote *māȝuy
Obviously they can't all be correct. Notice that a lot of these scholars seem to delight in masking their representation of the language with a bunch of unnecessary diacritics. (I've ranted against this before many times.) To aid in our investigation, we see that the plural form of the word is reflected in the Greek name Πανομιευς which represents the Egyptian phrase *pȝ-(n)-nȝ-mȝj.w 'He of the cats'. Of course, Egyptian shares with Arabic the use of broken plurals and so the plural vocalism is not necessarily the vocalism of the singular. In order to keep my sanity, I find myself forced to develop my own testable opinions on the matter with a conciliatory reconstruction of *māya /'mɑ(ː)jə/ for the period around 1500 BCE, and it seems sufficient to account for the later Coptic form moui agreed upon by the Sahidic, Bohairic, Akhmimic and Fayyumic dialects.

Back to Occam's Razor, one thing that frustrates me when I see this kind of diversity of opinion and no consensus is that the reasons why these individual scholars have arrived at their differing ideas appear to be grounded less in linguistic science and more in artistic whim. To me, phonotactic analysis is unavoidable in this task. We need to be absolutely conscious about how syllables are put together in our language of interest, not just the individual phonemes. We need to start with the most universally commonplace rules and meet each contradiction by adapting from a simple, most common state to a more complex and exotic one, not vice versa. Sadly, linguists often don't demonstrate this rigour, but it's vital in creating a coherent theory that obeys the KISS principle (i.e. Keep It Simple Stupid). So, to me, the diversity of opinion in the example of 'cat' is not so much the result of coherent theories clashing in competition, but a bunch of lazy theories made by scholars ignoring Occam's Razor in their idiosyncratic ways.


And how to handle those unstressed syllables?

Focusing just on how different scholars treat unstressed syllables in Egyptian, there doesn't appear to be a justification for how one decides which vowel it is, aside from appealing to outside branches of Afro-Asiatic like Semitic. Callender, for example, reconstructs *pAsīḏaw for 'nine' with wildcard symbol A whereas Loprieno chooses *pisī́ɟvw (nb. Loprieno's i = Callender's A) with yet another wildcard symbol v in the final unaccented syllable. In this case, Proto-Semitic, having only *tišˁu, has no equivalent cognate to enlighten our efforts on the matter.

Neither the Babylonian inscription EA 368, which records pi-ši-iṭ, nor the later Sahidic Coptic form psis gives us much evidence of what the first vowel was, because an unstressed vowel is often less audible than a stressed one. Coptic has already dropped the vowel while, for all we know, the Babylonians interpreted a garden-variety schwa as a lax -i-. I still search for precise evidence that justifies this need for more than one vowel quality in unstressed positions. Until I find it, I reconstruct *pasiḏa /pə'siɟə/ where unstressed *a is nothing other than the generic schwa /ə/ which we would find in all unstressed positions. Notice too that I choose to avoid unnecessary diacritics like the plague, as I believe we all should if we strive to be good little linguists.
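
The schwa rule is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the syllable division and the stress position are supplied by hand, the function name `to_phonemic` is hypothetical, and the small symbol table is read off the two transcriptions given in these posts rather than taken from any published standard.

```python
import unicodedata

# Symbol table read off the two transcriptions in these posts (an assumption,
# not a published standard).
SYMBOLS = {
    "\u1e0f": "\u025f",          # ḏ -> ɟ
    "y": "j",
    "\u0101": "\u0251(\u02d0)",  # ā -> ɑ(ː)
}
SCHWA = "\u0259"  # ə

def to_phonemic(syllables, stressed):
    """Turn hand-syllabified orthography into a phonemic string.

    `stressed` is the index of the stressed syllable (supplied by hand).
    """
    parts = []
    for index, syllable in enumerate(syllables):
        # Normalize so that precomposed and combining spellings both match.
        syllable = unicodedata.normalize("NFC", syllable)
        if index == stressed:
            # Stressed syllable: keep the vowel quality, prefix a stress mark.
            parts.append("'" + "".join(SYMBOLS.get(ch, ch) for ch in syllable))
        else:
            # Unstressed syllable: every *a is read as the generic schwa.
            parts.append("".join(
                SCHWA if ch == "a" else SYMBOLS.get(ch, ch) for ch in syllable
            ))
    return "".join(parts)

print(to_phonemic(["pa", "si", "ḏa"], 1))  # 'nine'
print(to_phonemic(["mā", "ya"], 0))        # 'cat'
```

With those inputs the sketch should reproduce the transcriptions given for 'nine' and 'cat' in these posts. The design point is that the vowel of an unstressed syllable never has to be guessed, since one rule covers all of them.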

Naturally if there is indeed unambiguous evidence of other possible vowel qualities in unaccented syllables, I'd love to hear about it. But until I do, Occam's Razor must be my guide.