Showing posts with label long-range theory. Show all posts

22 Feb 2008

The early Illich-Svitych on Indo-European and early Semitic contacts

I came across a paragraph from page 8 of Joseph Greenberg's Indo-European and Its Closest Relatives: The Eurasiatic Language Family (2000) (see link) that I found amusing:

"A new stage is reached in Nostratic theory in the work of Illich-Svitych, who is generally regarded as the founder of Nostratic in its modern version. His earliest comprehensive statement was published in 1967 in the form of a series of etymologies from the six families usually cited as 'classical Nostratic.' However, it is interesting to note that in a slightly earlier publication (1964), significantly called 'Oldest Indo-European-Semitic Linguistic Contacts,' he considered that the case for a relationship between Semitic and Indo-European was weak and that most of the resemblances were due to borrowing from Semitic by Indo-European."
You have to understand that Greenberg, the proponent of mass comparison to the chagrin of the rest of academia, was trying, like most Nostraticists and related enthusiasts, to promote his grand vision of an Ice Age protolanguage while tipping his hat to the man he exaggeratedly calls 'the founder of Nostratic in its modern version'. The modern version as it was in 1950, more like. In fact, Illich-Svitych's reconstructions suffer from the same flamboyant problems that Starostin's did, with an over-emphasis on parentheses and an under-emphasis on rational, regular sound correspondences. These are the kinds of things that a trained eye immediately recognizes as poorly worked out and farfetched without needing to waste time examining them in closer detail.

But Illich-Svitych's reconstructions aren't what amuse me in the above quote. I find it oddly interesting how Illich-Svitych went from the more conservative idea that any similarities between Indo-European and Semitic are due to borrowing (as the case of 'seven' conclusively shows) to an unlimited self-indulgent anything-goes view that would give him carte blanche to hallucinate linguistic unicorns. What went wrong? How did he go astray like so many other misguided souls that fumble about with their painfully doomed Indo-Semitic comparisons?

Much like the more recent long-ranger named Sergei Starostin with his obsession with diacritics, Illich-Svitych had a peculiar, romantic soft spot for ejectives which he marked with an underdot in his transcriptions. Markedness was violated everywhere according to his lavish brand of Proto-Nostratic and even some of the most common words in a normal human language such as pronouns and particles were replete with these articulatorily taxing ejectives. To add further irony, Allan Bomhard had recently offered an excellent alternative to Illich-Svitych's ejective-rich theory to finally address markedness problems by turning ejective stops into plain stops. Finally, real linguistics at work. Yet, since no one's perfect, Bomhard too committed the same mistake as any other linguist obsessed with the big picture over meticulous details by failing to recognize the undeniable layer of Semitic loanwords in PIE. This is why I still think that the many instances in his lexical comparanda that suspiciously fail to show anything but PIE and Afro-Asiatic evidence may be, if not complete red herrings, mere post-Nostratic loanwords.[1]

So I wonder: Is it possible at all for Nostraticists to move forward instead of running around in circles by finally recognizing the obvious? Can we all not see that PIE and Pre-IE were surely in contact with neighbouring languages throughout their prehistoric development (just like any normal language ancient or modern)? And if we can all reason that far, then can't we start taking the time to weed out those later post-Nostratic contacts first before we publish mere conjectures about a remote paleolithic language?

NOTES
[1] Ilya Yakubovich from the Department of Linguistics at the University of California, Berkeley writes in his online article The Nostratic linguistic macrofamily: "According to Moscow scholars, he frequently confuses cognate words with later borrowings (e.g. by trying to reconstruct Nostratic numerals)."

30 Nov 2007

A ramble about the Nostratic pronominal system

I'm sure I must have hinted before that I hate when some treat long-range theories (like Nostratic, North Caucasian, Dene-Caucasian, or whatever far-away proto-language) as if they're written in stone. A person with a level head recognizes these ideas for what they are, idle conjectures requiring many amendments before something more substantial can be made of them. However, I'm not against conjecture as long as it's fully differentiated from facts or well-substantiated theories. I also think there is an important difference between sharing conjectures for discussion on a blog or in a forum versus wasting trees to write a manifesto of your pseudolinguistic doctrine for you to enforce on disbelievers.

As much as I sound like a conservative fart for downplaying long-range comparison, I'm actually quite interested in it. It's just that I haven't read anything serious enough for me to go "wow!" yet and as I learn more, the errors in books start to become more apparent. Overall, I'm the most impressed (in a very moderate sense) by the Nostratic hypothesis as presented by Allan Bomhard who proposes that Indo-European, Uralic-Yukaghir, Altaic, Eskimo-Aleut, Elamite, Dravidian, Sumerian, Kartvelian and Afro-Asiatic language families come from a parent language dated to about 15 000 BCE in a period following the last ice age. He wasn't the first to come up with this century-old theory but he had a few different takes on it. For now, Nostratic is not an established theory because it doesn't present enough evidence to prove its claims, but it doesn't hurt to suggest further improvements that may help to inspire discussion and, just maybe, progress.

When looking through Allan Bomhard's Indo-European and the Nostratic Hypothesis (1996) or The Nostratic Macrofamily: A Study in Distant Linguistic Relationship (1994) co-authored by Allan Bomhard and John Kerns, one thing that I noticed was how many pronouns are being reconstructed without a clear structure. This is but one of a number of serious gaps in this theory just waiting to be resolved. The reconstructions presented by Bomhard and Kerns are always cited ad nauseam in ablaut pairs (e.g. *ma-/mə-) which of course serves no other purpose than to make the book twice as long. Since the ablaut patterns are said to be regular, there is no need to cite the second member of each pair any more than it is necessary to cite the Indo-European root *bʰer- as *bʰer-/*bʰor-/*bʰēr-/*bʰr̥- each and every time. So I will dispense with irrelevancies and cite only the first member of each of their pairs below.

First off, Bomhard and Kerns, on page 3 of The Nostratic Macrofamily: A Study in Distant Linguistic Relationship, show us this list of pronouns in the 1st and 2nd persons: *mi "I" [1ps], *tʰi "you" [2ps], *ma "we" [1pp.inclusive], *wa "we" [1pp] and *na "we" [1pp]. Immediately after are "notes" which are hampered either by irrelevancies or false information. For example, it suffices to say that Indo-European (IE) has a 1ps enclitic pronoun *me, 1ps genitive *mene, verbal 1ps thematic secondary ending *-m and verbal 1pp ending *-mes, the last being nothing more than a 1ps element with the plural ending *-es. So indeed there is ample evidence of an underlying 1ps pronominal root *me- in the deepest recesses of IE's prehistory. Discussing its development in IE's Celtic branch, however, is wasteful rambling since it's obviously immaterial to Nostratic reconstruction and *me is well established in all other branches of IE even without the consideration of Celtic. Basing an Afro-Asiatic reconstruction solely on Chadic is bad practice known as "reaching". The so-called Etruscan imperative endings cited (-ti, , -θi) are without substantiation, if not provably false altogether, despite ad hoc claims made by some prominent Etruscologists such as Giuliano and Larissa Bonfante. The belief that these endings are imperatives is based on ad hoc comparisons with Indo-European imperatives in *-dʰí (e.g. *h₁sdʰí /ʔəsdí/ "be!").

These aren't all the first and second person pronouns that are suggested by Bomhard and Kerns (see here). False comparisons are made between an underlying Uralic and Eskimo-Aleut 1ps ending in a velar stop on the one hand and Indo-European *h₁eǵoh₂ (cited as *ʔekʼ-) on the other[1]. Some fun pronoun splicing of random data from the Afro-Asiatic family and presto changeo, yet another 1ps pronoun, *ʔa-. Then don't forget Bomhard's 1st person pronoun *ʔiya, supposedly proved by evidence from Chadic.

So in the 1st person alone, we now have five claims: *ʔa-, *ʔiya-, *ma-, *na- and *wa-. I'll discuss this more later.

(Continue reading the sequel: A ramble about the Nostratic pronominal system, part 2.)

NOTES
[1] Read my views on the etymology of PIE's nominative 1ps pronoun in The origin of Indo-European ego.

25 Nov 2007

How NOT to reconstruct a protolanguage

I wrote an article last month, The Tower of Babel, which was an unexhaustive critical assessment of the late Sergei Starostin's grandiose online language project that limps on today through the efforts of surviving project members. A recent troll on that page under an unconvincing disguise of "G.Starostin" sent me two messages, one visible because it was civil if misguided, while the second was abusive and thrown in the trash after I took note of his IP address. In case anyone was confused, my blog isn't a mouthpiece for proto-world rhetoric and I'm an ardent defender of mainstream linguistics despite my moderate interest in long-range linguistics. It suffices to reject the Tower of Babel project based simply on the consistent use of outdated and even disprovable information. Take its Indo-European database, infected with Julius Pokorny's 1950s reconstructions, which notoriously neglected to reconstruct laryngeals to properly account for reflexes in Anatolian languages like Hittite, Luwian and Lycian. When a word halχ is assumed to mean “10” a priori in Etruscan purely by eyeballing texts and ripping words out of context in order to reject what is already established to be śar (cf. Bonfante, Reading the Past - Etruscan (1990), p.61), Starostin's supplied pdf entitled Etruscan numerals: Problems and Results of Research by S. A. Iatsemirsky[1] is not credible enough to identify either the problems or the results of serious Etruscan research. Its Dravidian database is full of largely unaccepted reconstructions using voiced stops that are not proven to be necessary in that proto-language[2]. Then the addition of a Nostratic database and "Long Range Etymologies" is sure to add to the air of mediocrity of the website, putting the cart before the horse in light of the numerous mistakes regarding the more accepted languages and language families I just mentioned.
This is all on top of the decidedly negative assessments of North Caucasian pushed by Sergei Starostin during his lifetime (Johanna Nichols, Current Trends in Caucasian, East European, and Inner Asian Linguistics (2003), p.208). I personally believe in an efficient use of time. So if it's proven that this website is consistently at odds with the mainstream, one would be wise to obtain a higher quality of information elsewhere.

On that note, it's important to discuss how NOT to reconstruct a protolanguage so that we're all on the same page and can more easily distinguish between real linguists and narrow-minded loons, whether online or in print. Considering that even Merritt Ruhlen of "Proto-World" infamy[3] has obtained his PhD from Stanford University, it's important to not be deceived by academic status. Theories can be ill-conceived no matter who one is or claims to be. So let's go through my cheeky list of important strategies that we can follow (using examples from the Tower of Babel project) if we want to isolate ourselves and be rejected by all universities around the world.

1. Use "phonemic wildcards" obsessively!
Cast the net wider and you might catch something!

The abuse of mathematical symbols like C, V, [a-z], (a/é/ö), etc. is an excellent way to make your idle conjecture look like a valid theory. It might be called "reconstruction by parentheses" since parentheses are either explicitly shown or hidden by a single variable. An example of this is *k`egVnV (claimed to be the Proto-Altaic word for "nine" in the Tower of Babel database). Obviously, if V represents all possible vowels in this proto-language and there are, say, ten of them possible in either position, then the fact that there are two wildcards in the same word means that the word represents a humungous, two-dimensional matrix of ONE HUNDRED possible permutations (10*10=100):

*k`egana, *k`egena, *k`egina, *k`egüna, *k`egïna, etc.
*k`egane, *k`egene, *k`egine, *k`egüne, *k`egïne, etc.
*k`egani, *k`egeni, *k`egini, *k`egüni, *k`egïni, etc.
*k`eganü, *k`egenü, *k`eginü, *k`egünü, *k`egïnü, etc.
etc.

Since no single form is actually being posited when wildcards are present, any claim of regular correspondence by such a theorist can be easily identified as fraud. If such linguists can't take themselves seriously enough to hypothesize a structured and testable theory, why then should we take them seriously in turn?
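
For anyone who wants to check the arithmetic above, here is a minimal sketch in Python that expands a wildcard template into every concrete form it stands for. The ten-vowel inventory is purely an assumption for illustration (the figure of ten is only a "say" above), and the `expand` helper is a hypothetical name of my own, not anything from the Tower of Babel database.

```python
from itertools import product

# Hypothetical ten-vowel inventory, assumed only to match the "say, ten" above.
VOWELS = ["a", "e", "i", "o", "u", "ä", "ö", "ü", "ï", "ə"]

def expand(template, wildcard="V", inventory=VOWELS):
    """Return every concrete form that a wildcard template stands for."""
    parts = template.split(wildcard)      # fixed material between wildcards
    slots = len(parts) - 1                # number of wildcard positions
    forms = []
    for combo in product(inventory, repeat=slots):
        form = parts[0]
        for vowel, rest in zip(combo, parts[1:]):
            form += vowel + rest
        forms.append(form)
    return forms

forms = expand("*k`egVnV")
print(len(forms))    # 10 * 10 = 100 candidate forms
print(forms[:3])
```

Each extra wildcard multiplies the count by the size of the inventory again, which is why a "reconstruction" with three or four of them stops being a single testable claim.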

Other hilarious examples of wildcard fairy tales on the Tower of Babel site include Nostratic *cUKV ( ˜ č`-) "bundle" (in other words, all four are wildcards... jackpot!), Dravidian *kaṬ- "to cut into pieces" (universal onomatopoeia, anyone?), Semitic *ʔVrib- "tie (a knot)" (based on a single language, Arabic) and North Caucasian *ƛ̣_VẋwV ( ˜ Ł_-)̆ "rake" (wow, the number of possible permutations in this wildcard buffet is positively mindboggling! 200 perhaps?).

2. Ignore Occam's Razor and never seek logical justification for your ideas!
If an exotic phoneme gives you an orgasm, reconstruct it!

Most longrangers ignore Occam's Razor or fail to apply it in all aspects of their budding theory. It's easy to understand why it's not valid to reconstruct a sound in a proto-language which shows no regular correspondence in its daughter languages. However, even when one has justified a phoneme with evidence, one still has to justify the plausibility of the larger sound system that it's a part of. So if you have greater evidence for a palatal *ź than you do for its plain counterpart *z, you still have a problem to solve (cf. phonemic markedness). If pronouns and common affixes use the more complicated sounds of the inventory of your proto-language, you still have a problem since this goes against the trend in languages we observe throughout the world, a reason that Allan Bomhard used to reject Illich-Svitych's reconstruction of Nostratic (e.g. Illich-Svitych and Dolgopolsky reconstructed the 2ps pronoun starting with the symbol *ṭ-, an ejective rather than its plain counterpart). This is how Occam's Razor works. In all aspects of our theory, we must abide by the simplest answer possible. Whenever you hear an argument like "Yeah, but, there's this language in some remote part of Africa with 30 speakers that uses a really rare sound or does something else that's really rare just like in my theory!" then you know that you're not dealing with someone in their right mind. Occam's Razor avoids unnecessarily exotic solutions at all times and teaches us to not confuse "minute possibility" for "convincing probability". For example, Klallam is certainly an existing spoken language, but there's also no doubt that its sound system and consonant clusters are very rare. So Klallam is something that your proto-language should not look like until you have solid proof (i.e. numerous regular sound correspondences) to back it all up.

By searching in the Tower of Babel's North Caucasian database for words beginning with sibilants, we get the following screwy search results. As of today, only one word with plain *z- in initial position is to be found, namely the first person pronoun claimed to be *zō, despite the fact that there are two instances of *ź- and *ž-. This means that plain *z- is outnumbered 3 to 1 by the comparatively more exotic counterparts with palatalization, labialization, clusters, etc. Even worse, there are only two instances of plain *s- among twelve roots starting with unvoiced sibilants. So plain phonemes are in the minority, as we would find if we were reconstructing a science-fiction language. Starostin's North Caucasian consistently defies any rational structure or common sense and is a perfect example of diacritic overkill.

3. Make pages and pages of "correspondence tables"
They're sure to impress your family members!

"Correspondence tables" are lists of sounds in the daughter languages of a hypothetical proto-language proposed to prove regular correspondence and thus genuine relationship. So we can say that Germanic often corresponds to Latin t as Jacob Grimm remarked upon in 1822 showing that Germanic and Latin are part of the Indo-European family of languages. However, language isn't that simple and far more often than not, there are numerous exceptions to such simplistic equations. For example, the word 'eight' is octo in Latin and yet *ahtōu with a *t in Germanic. This is because the stop fails to be weakened to a fricative after another stop. What good then are correspondence tables when we can save time and space by actually describing sound changes and their processes? For some reason, Nostraticists and other longrangers like to use these at every turn, as does Sergei Starostin. These childishly repetitive tables simply waste pages and pages of paper and bandwidth without being terribly informative, but it's certainly an excellent way to make your book look thicker and impress your family.

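To show what "describing a sound change" can look like in place of a table, here is a rough sketch in Python of a single conditioned rule: t weakens to the fricative þ unless another stop stands right before it. Everything here is an assumption for illustration only: the stop set (p, t, k), the `apply_shift` name, and the schematic input strings, which are loosely modelled on Latin spellings and are not reconstructions of any actual pre-Germanic forms.

```python
import re

# Toy rule: t becomes þ, except immediately after another stop (p, t or k).
# The stop set and the inputs below are illustrative assumptions only.
SHIFT_T = re.compile(r"(?<![ptk])t")

def apply_shift(form):
    """Apply the conditioned t > þ rule to one schematic form."""
    return SHIFT_T.sub("þ", form)

for form in ["tres", "frater", "okto"]:
    print(form, "->", apply_shift(form))
```

One line of rule plus its condition covers both the regular cases and the 'eight'-type exception, which a bare two-column table cannot express.
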
4. Remember: All critics are conspiring against you!
Beat dead horses to death and if you can't win, punch them!

You may find that your theory isn't gaining the kind of press that you had hoped and quite a few may be noticing several flaws in your theory. You may not have a single factoid in your favour to form a coherent rebuttal. This is when you bring out the big guns: ignorance combined with non sequitur. This tactic must be handled delicately however. You could try attacking your critics on the personal level, whether that be through the direct use of swearwords or through subtle mockery of your opponent. However this is a desperate last resort, more common on Yahoo! Forums or Youtube. It looks more professional however to simply ignore critics altogether while overpraising the capabilities of yourself and your associates. Using a plethora of unnecessarily sesquipedalian, multipolysyllabic megaterminology, such as "lexicostatistical", is a great tactic to conceal the weaknesses of your theories, as is treating your conjectures as proven facts in any of your publications so as to not bog down your important work with silly things like justification or common sense. Remember, all critics don't know what they're talking about. Their valid criticisms are just a devilish trick of theirs to throw you off-track and pull you off of your hobby horse.

NOTES
[1] Note that this pdf incorrectly cites TLE 295 in reference to a word zar when in fact it's properly TLE 275. Furthermore, automatically assuming that zar and śar are the same word purposely ignores phonemic distinctions in order to stroke one's pet theory. The instance of huθ-zars declined in the genitive case (TLE 191) has absolutely nothing to do with zar and everything to do with the fact that a dental stop plus the initial sibilant of attested śar (TCort ii) yield z in this one particular instance. It's all quite understandable once one puts in the time and effort learning the basics of Etruscan phonetics.
[2] See Krishnamurti, Comparative Dravidian Linguistics: Current Perspectives (2001), p.250
[3] Visit Mark Rosenfeld's humorous but rational article on the Proto-World language and its associated failures in reasoning: Deriving Proto-World with tools you probably have at home. One of the most poignant criticisms of the proposals of Merritt Ruhlen and Joseph Greenberg (R&G) that I appreciate here is: "R&G really gain the benefit of obscurity here: how many of us can determine whether they are (unconsciously) playing the same kind of tricks with Tfaltik and Guamo as I am playing with Chinese and Quechua here?" This criticism is equally applicable to Starostin's theory of North Caucasian and his Tower of Babel project where a similar "benefit of obscurity" is being used against his readers.

UPDATES
(Feb 14 2008)
My entry The hidden binary behind the Japanese numeral system exposes another flaw in Starostin's reconstructions concerning the origin of Japanese numerals.

2 Oct 2007

The Tower of Babel

Sergei Starostin, before sadly passing away from a heart attack at the early age of 52, headed the Tower of Babel Project, an online project whose name is in unfortunate reference to a Christian biblical myth about language origins as given in the book of Genesis. Ironically, much like the bible and its unquestioning literalists, many linguaphiles online are in the habit of quoting Starostin and holding his theories in great esteem while ignoring all the deep problems of logic that are immanent in them.

Now, before you believe in wiki-haste that I'm just a nagging oddball, please note that my view is on the side of academics who have large issues with Starostin's work. There is no conspiracy against Starostin and his followers. We don't have a hate-on for him as a person and I'm sure he had a jovial spirit by all accounts, but these critiques arise because his theories are simply not tenable and too far-flung to be of use to a disciplined linguist. If people feel too personal about linguistic critiques, it's their issue to deal with.
  • http://www.accessmylibrary.com/coms2/summary_0286-25704408_ITM
    "With fewer than 300 linguists in the world doing serious work on long-range comparisons, the discipline is small and perennially insecure about its scientific standards. Given the dearth of rigorous proof for some of Starostin's assertions, many American linguists felt within their rights to dismiss his research or at least to exclude him from their conferences and symposiums."
So let's see why this is the case from examples we can see within Starostin's own project. At his site, a myriad of proto-languages seem to be tackled all at once but it centers around his personal baby: "North Caucasian" (note that the name refers strictly to the languages of the northern regions of the Caucasus mountains, not to a racial term). Throughout his life he tried to force the two language families called Abkhaz-Adyghe and Nakh-Daghestanian into the same, proverbial, round hole. The view of actual Caucasian language specialists is that Abkhaz-Adyghe (AA) and Nakh-Daghestanian (ND) are entirely different in structure, phonology and word order. While AA tends towards simple three-vowel systems (some even claim two-vowels), ND is radically rich in vowels and consonants. The two families couldn't be any more different from each other. Plus, any similarities between the two could easily have been caused by strong areal influence between the two within such a small region and over a large expanse of time.

No matter. Starostin was unfazed and he set to work to create a hybrid language, more like a conlanger than a comparative linguist. He no doubt figured that his language would serve as a valid ancestor to both language families. Only one problem. It only works if you ignore rigorous methodology. Here are just a few flaws that I notice with North Caucasian that proponents just don't care enough about:

1. Violations of phonological markedness are everywhere
I think Starostin had a "diacritic addiction". For example, all pronouns like first person *zō violate markedness because plain *z is so rare in his convoluted phonology while palatal *ź is used to explain almost everything (see his list of 'z words' here) where his diacritic bias is self-evident. Many phonemic exotica are far too often employed, such as *ƛ̣, to make up for his shortcomings in taking the time to adequately demonstrate sound correspondences (e.g. the ambiguous reconstruction *Ł_ĕɫV̆ with link here). Through the use of contrived sounds as a smokescreen, void of a discernible phonological structure that all human languages have, he can freely connect different etyma together no matter how large of a leapfrog we have to hurdle to swallow it, making it seem to laymen as though North Caucasian has a lot more evidence behind it than it actually does.

2. The pronominal system is unnatural

Considering that his reconstructed pronouns curiously use only voiced consonants, an attentive linguist might consider the effects of sentence-internal lenition on grammatical elements and more sensibly reconstruct unvoiced *s for *z in the 1ps pronoun or *t instead of *d for the 2ps pronoun. One might also reduce the bloated phonemic inventory to a more manageable level that could then be finally accounted for systematically by demonstrable proof, instead of leaving it all hanging as an empty assertion.

3. Shoddy claims in more understood language groups of his database compromise his credibility.

Nothing could be more far-flung than Altaic **séjra "three" (link here) when Altaicists reconstruct *göl- "three" on more direct evidence (Mongolian gurav, Japanese kokono-). Despite the fact that it is agreed upon by specialists that Proto-Dravidian is reconstructed without voicing contrast in stops (read here), Starostin took an anarchist approach by representing Dravidian with them (link here).
The number and variety of protolanguages in his database are impressive. The sheer quantity of seeming information alone would cause many to believe that his work is exceptional. However, quantity is not quality. I think the issue with Sergei Starostin was that, like many language lovers, he failed to see that linguistics is a science and not a form of artistic expression.