Paleoglot: protolanguage

3 Jan 2012

Baxter-Sagart reconstructions and Occam's Razor

The internet abounds with information if we make the effort to search. One interesting find is a pdf of the Baxter-Sagart reconstruction of Old Chinese roots in tabular format. Excellent! But being an analytical bad news bear, I also see some important issues that tie in with my stance on developing orthographies that properly conform to Occam's Razor. This is out of respect for logic, for necessary simplicity, for clarity and for general readers, some of whom may not be well-versed in linguistics but who are nonetheless interested in the beauty of a language and its history.

Contempt for Occam's Razor inhabits even mainstream linguistics, and the field is far too often misconceived as an intuitive art rather than a logical science. I put my money on organized phonologies and uncluttered orthographies that express only what's necessary for the topic at hand. It's not necessary to show exact phonetics of a word each and every time when the discussion is not about the exact phonetics of a language. If we have a list of roots, it doesn't make sense to list it all out in excruciating phonetic detail any more than it makes sense to write English this way. As such, mixing IPA symbols into your orthography often spells more trouble than it's worth. "IPA" doesn't stand for International Orthographic Alphabet. At some point a decent linguist must come up with a sensible, legible, optimal, uncluttered orthography to express their language of study beyond the microscale phonetic level. A means, in other words, to quickly and clearly cite words in a vocabulary, pruned for immediate and sufficient comprehension by an everyday reader. Abusing symbols to complicate the message is as corrupt a practice as abusing unnecessary specialist terms for little other reason than for show.

At the top of the list, the Baxter-Sagart team begins with roots like *ʔˤra. This shows us that they envision a phonemic pharyngealized glottal stop. Fine. However, unless */ʔˤ/ is phonemically distinct from other phonemes in the language, say */ʕ/, why be so precise on the orthographic level? Why not use a single clear symbol for this instead of mixing up orthography with the phonetic level far below it? If the orthography, in its necessary simplicity, doesn't make the phonetics you intend very clear, one may simply write a quick primer on it and be done with it. If this were the only issue, I could concede that perhaps there's some reason for it that I've overlooked.

Further down the list, we also have *qˤrep which is quite the tongue-twister. One may dismiss this as within the bounds of plausibility although I do admit that this apparent pharyngealized uvular stop is unusual for its Schrödingeresque ability to inhabit two places of articulation at once. Then again, there are many consonant rich languages like Klallam around, right? We also have to keep in mind though that these kinds of languages are also quite rare and there's nothing scientific and methodical about a theory that strives towards the exotic rather than the minimal. Strong proof should come before the addition of a new phoneme to a reconstruction.

But when we come across *qʷʰˤat-, what are Baxter and Sagart trying to express to us and how does it fit into a plausible phonological system? A labialized, aspirated, pharyngealized, uvular stop??? How on earth could this possibly be contrastive with another phoneme? Surely at this point we have to concede that Baxter and Sagart have not respected the differences and proper uses of phonetic versus orthographic transcription. It gives the impression of a poorly organized phonology and orthography, mixing exact and even unlikely phonetic symbols together to create a visual mess that ends up being more confusing to the reader than helpful. At this point, it's just not reflective of the facts, even when (and especially when) armed with knowledge of the IPA system!

Keep in mind that others have already expressed concerns about the use of "j" in Middle Chinese onsets in words like gji (祇 qí), considering that the "phoneme" doesn't seem to exist when compared to some loanwords coming from outside Chinese (eg. MC *bjut [Baxter] < Sanskrit buddha 'enlightened one; Buddha'). There is indeed informational value behind "j" here but it's very unlikely to be a true semivowel or a palatalization of the preceding consonant. At some point then, we have to get back to reality, paying careful heed to creating a balanced, minimal orthography because overcomplexity quite simply hampers progress in all things.

12 Nov 2011

A matter of the Egyptian heart

The Egyptians placed a lot of importance on the heart, which was believed to be the seat of the mind and the soul. In the English-speaking world, we usually treat "heart" as a symbol of the feelings, but for ancient peoples around the Mediterranean it was instead the seat of reason and essence. They didn't yet realize the significance of the brain in that regard: of the bodily organs that Egyptian mummifiers traditionally preserved in their sacred rites, the brain wasn't one of them.

Considering how central the heart was to the ancient Egyptian perception of the soul, one would think we'd know how to pronounce the word by now. In hieroglyphs, it's represented only in consonants and we write this in standard orthography as ỉb. This unfortunately gives the false impression that we should just assume a pronunciation of /ib/. Indeed, Antonio Loprieno does reconstruct */jib/ and compares it directly with Semitic *libb- assuming in turn an Afro-Asiatic reconstruction of *lib (see Ancient Egyptian: A linguistic introduction [1995], page 31). So isn't that our answer?

I'm beginning to think it isn't. For one thing, this reconstruction could only work for the earliest stage of Egyptian before all instances of word-initial *y- were nullified in the language. Since the reed leaf symbol came to represent a glottal stop as a result, by the time of Middle Egyptian, we could only have had *ib at best. So isn't this our answer then?

To be honest something still seems off. The related Cushitic branch seems to instead point to *lub- with a rounded back vowel. If we derived an expectation of the Egyptian form from that piece of external data, we'd arrive at *ub, not *ib! Adding to the difficulty is that Coptic has replaced the word for "heart" with a completely different word, hēt (from ḥȝty). No clues there.

So what can we rely on to decide the matter? I finally came across the Hebraicized name Ḥophraˁ, the name of a pharaoh of the sixth century BCE. The original Egyptian form is represented in hieroglyphic writing as wȝḥ-ỉb-rˁ. It suggests that ỉb was at that point pronounced like the -oph- in Ḥophraˁ, causing me to want to side with the Cushitic reconstruction. Therefore *ub seems far more sensible than Loprieno's *(y)ib.

I'm curious about this word lately and want to get it right because of the parallel Proto-Berber form reconstructed as *ulβ. I wonder then if this might suggest that Proto-Berber had coloured the prothetic vowel with the original quality of the root vowel now lost between the two surviving consonants. If so, I have no clue how to account for the *i in Semitic *libb- however. The Semitic vocalism of the root now becomes the outlier.

10 Nov 2011

The reconstruction of the Pre-Egyptian case system

Antonio Loprieno states something confusing to me on page 55 of Ancient Egyptian: A linguistic introduction (1995):

"Also, the ending *-u is still preserved, although functionally reinterpreted, in the forms of some singular patterns as well: when the original stem ended in a vowel, for example *u in *ḥāruw '(the god) Horus,' *-a in *ḫupraw 'form,' or *-i in *masḏiw 'enemy,' the ending was maintained as a glide, often written in good orthography as <-w> in the case of *-aw as opposed to <-ø> in the case of *-iw or *-uw: <ḫprw> =: *ḫupraw 'form,' <ḥfȝw> =: *ḥaf3aw 'snake.'"

Stated more directly, he's claiming that the *w in *ḫupraw was written by scribes according to "good orthography" while strangely ignored in *masḏiw and *ḥāruw despite being present in all these words. It's hard to understand why that would be so. It's rather as if we have *ḫupraw with *w but *masḏi and *ḥāru without. But then this would be inconsistent with what he's stated on the development of the case system from Pre-Egyptian into Old Egyptian.

So it seems that either I'm missing something here or his theory needs a few tweaks. If I ventured an attempt at revision, perhaps we could try Pre-Egyptian nominatives *ḫaprúwu, *másḏiyu and *ḥārawu. After reduction of unstressed vowels, these become *ḫaprūwa /xəpʰˈɾəwə/, *masḏi /ˈmasɟi/ and *ḥāru /ˈħaːɾu/ before the case ending was omitted altogether: *ḫaprū, *masḏi and *ḥāru. I contend that only the first word ever motivated writing w. I question its existence altogether in the pronunciation of the second. In the third, 'hawk', I suspect the word was built on the notion of 'that which is above', consisting of *ḥar 'above, upon' and an ancient masculine suffix *-aw, becoming therefore *-u. As such, it couldn't have had a consonantal w during literate times either since we have only a short vowel. This then explains Loprieno's "good orthography" which now reflects a transparent, underlying reality. No more arcane scribal rules on whether or not to write the trailing semivowel. No more wildcard symbols either, which I've shaken my fist at before.

7 Nov 2011

Changes in Pre-Egyptian vocalism

Lately I've been reflecting on what Loprieno says about the early Egyptian vowel system on page 55 of Ancient Egyptian: A linguistic introduction (1995):

"In our discussion of phonology (section 3.4.3), we saw that one of the major features of Egyptian in its early stages was the presence of a strong expiratory stress, which eventually caused a reduction to /ø/ of short vowels in open syllables in posttonic position, with the resulting change from the Dreisilbengesetz to the Zweisilbengesetz (**saḏimat > *saḏmat 'she who hears')."

While Loprieno speaks of reduction to zero, I've long been thinking more along the lines of a Pre-Egyptian system of *a, *i and *u being reduced to *schwa* wholesale in all unstressed positions. To begin with, long vowels were only to be found in stressed positions in Pre-Egyptian, at least if the comparison with Proto-Semitic is trustworthy, and this length contrast in stressed positions clearly remained in Egyptian, as still evidenced by Coptic. I therefore choose to write all of these reduced, unstressed monophthongs of Pre-Egyptian as *a (to be implicitly understood as [ə]). Furthermore diphthongs *Vy and *Vw (*V = any vowel) then become *i [əj] and *u [əw] respectively. This has worked very well for me for a while now. The result is an Egyptian vowel system that still looks on the surface much like Proto-Semitic with long vowels restricted to stressed syllables and unstressed positions having only short *a, *i and *u. Yet since the system has been notably altered, we find a curious incongruence nonetheless between the vowels of Proto-Semitic and those of Egyptian.

We can also avoid a lot of the wildcard symbols Loprieno and others occasionally use in the unstressed syllables this way since my theory makes this pointless: Only *a can exist in these positions unless accompanied by a written semivowel y or w in which case the appropriate short high vowel is selected. It appears that the matter of whatever the original vocalism may be is an issue for Pre-Egyptian reconstruction, not Egyptian proper. Loprieno's */'ri:ʕuw/ (> */'ri:ʕə/) 'sun' becomes my *rīˁa.
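As a rough illustration, the convention for unstressed syllables can be restated as a three-way lookup. This is only a minimal sketch of the rule as worded above; the function name `unstressed_vowel` and its string argument are invented for the example and come from no published source.

```python
def unstressed_vowel(written_semivowel: str = "") -> str:
    """Vowel to write in an unstressed syllable (hypothetical helper).

    written_semivowel is "y", "w", or "" when the hieroglyphic
    spelling shows no semivowel in that syllable.
    """
    if written_semivowel == "y":
        return "i"  # reduced *Vy, to be understood as [əj]
    if written_semivowel == "w":
        return "u"  # reduced *Vw, to be understood as [əw]
    if written_semivowel == "":
        return "a"  # any reduced monophthong, to be understood as [ə]
    raise ValueError("expected 'y', 'w' or an empty string")


print(unstressed_vowel())     # a
print(unstressed_vowel("w"))  # u
print(unstressed_vowel("y"))  # i
```

The point of the sketch is that no wildcard value is ever returned: the written form alone decides among the three options.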

There are further reasons why I'm dwelling on this, but I've divided it up into subsequent posts.

27 Oct 2011

Small quibbles about Proto-Berber orthography

Phoenix responded to a minor issue I raised about Proto-Berber orthography in Why I reconstruct *β and not *v. In defense of using a relatively arcane symbol *β (taken from the IPA system) for a v-like sound that could instead be accommodated by a straightforward symbol *v, he supplied the following reasons:

"In African linguistics v is commonly used as the symbol for the voiced fricative while β is used for the labial approximant."
"So I don't use v to transcribe Proto-Berber β, because it would suggest that it is the fricative counterpart to *b."

So from what I can see, his justification for the specialist symbol boils down to phonetics and tradition in the field. However I fail to find any justification here grounded in a clear methodology of some kind.

To the first argument, I suggest that basing an orthography on the phonetic level is inevitably cumbersome because it's then prone to constant revision as new discoveries about underlying phonetics come into view. A more stable and sensible orthography is based on the higher phonemic level instead, which focuses less on exact articulation of each sound in its context but instead displays for us *distinct* sounds of the language. For example, in English, the phoneme /p/ is pronounced differently in "spun" than it is in "pat". The /p/ in the former example is completely without a puff of breath (ie. [p] in IPA symbols) since it follows /s/ while in the latter example, /p/ is indeed pronounced with a puff of breath by default (ie. [pʰ]). However on the higher phonemic level, we represent in both examples the single phoneme /p/ to eliminate details that are irrelevant to the focus at hand. It'd be likewise unnecessary to write out every word of a proto-language like Proto-Berber with only phonetic symbols rather than phonemic ones unless the topic was specifically about the exact articulation of each sound.
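The English example can be put as a toy function from the phonemic level to the phonetic one. This is a deliberately simplified sketch covering only the two contexts mentioned above (after /s/ versus the default); the function name and the phoneme lists are assumptions made for illustration, not a full account of English aspiration.

```python
def realize_english_p(phonemes):
    """Map a list of phonemes to phones, spelling out only the /p/ allophones."""
    phones = []
    for i, ph in enumerate(phonemes):
        if ph == "p":
            after_s = i > 0 and phonemes[i - 1] == "s"
            # Plain [p] after /s/, aspirated [pʰ] by default.
            phones.append("p" if after_s else "pʰ")
        else:
            phones.append(ph)
    return phones


spun = ["s", "p", "ʌ", "n"]  # phonemic /spʌn/
pat = ["p", "æ", "t"]        # phonemic /pæt/
print(realize_english_p(spun))  # ['s', 'p', 'ʌ', 'n']
print(realize_english_p(pat))   # ['pʰ', 'æ', 't']
```

Both inputs contain the same symbol "p"; the phonetic difference is derived by rule rather than stored in the spelling, which is exactly what a phonemic orthography relies on.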

It's also a fact that there are exceedingly few if any languages that contain two distinct phonemes /β/ (bilabial fricative, pronounced by blowing through near-closed lips) and /v/ (labiodental fricative, pronounced with the lower lip touching one's upper teeth). It's pointless to obsess over minutiae about the exact articulation of the sound if it can be reasonably ascertained that the sound was v-like. It then suffices to take advantage of an available letter from the Roman alphabet, *v, to aid readability both by specialists and by people in general. Things should be written with clarity for both specialists *and* the general public when possible lest it encourage ivory tower attitudes, the scourge of current academia.

To the second argument, tradition indeed is a seductress but it must be rejected when it no longer clarifies but obfuscates. Sometimes tradition is misguided. Sometimes tradition is outdated. Sometimes tradition is just plain wrong. In this case, I feel that this tradition is wrong precisely because of the first argument, that orthographies should reflect the phonemic level not the phonetic and that by ignoring this rule, one has unnecessarily obfuscated rather than clarified.

Possible solutions

After reading Phoenix's explanation with deep interest, I pondered on how the system might be revised to be clearer and to follow a more consistent methodology in its design. By following the principle of phonemics over phonetics, and by reserving diacritics and special symbols for the rarer sounds of a language marked by special articulatory features, we can arrive at a more balanced and clearer phonology.

Breaking with empty Berberist traditions, emphatic sounds may be marked by the underdot, as in Proto-Semitic studies. Again, we all may quibble about the exact pronunciation of *γ (or *q) but a revised symbol *ġ has the definite advantage of visibly showing a shared feature of "emphatic" with the other emphatics which would likewise be indicated more consistently with the dot: *ḍ, *ḍḍ (former *ṭṭ), *ġġ (former *qq), *ẓ and *ẓẓ. The missing emphatic counterpart of *b, represented in this new system as **ḅ, is now impossible to confuse with non-emphatic *v which lacks the underdot. We may finally eliminate unnecessary IPA symbols and replace them with more generally readable symbols from the standard Roman alphabet that we already use while simultaneously making explicit any shared features that the different sounds may have in the language, such as "emphaticness".

And finally, through this revised system, specialists may continue to debate on the exact articulation of *ġ and such, but it won't affect the symbol shared among the specialist community until the phoneme's emphatic nature or its existence is disproven.

UPDATE
(1 hour later) Upon further thought (my mind never stops!!), enforcing a surface representation with unvoiced letters might be even more kosher and, again, this would be even more in line with what's done in Proto-Semitic linguistics. So alternatively, we could use the following symbols to clean things up: *ṭ (= *ḍ), *ṭṭ, *ḳ (= *γ), *ḳḳ (= *qq), *ṣ (= *ẓ) and *ṣṣ (= *ẓẓ).
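To make the proposal concrete, the respelling can be sketched as a lookup table applied with longest-match-first substitution. The table below encodes only the replacements named in this post (the voiceless scheme from the update, plus *β → *v); the names `RESPELLING` and `respell` are hypothetical, and any symbol not listed is passed through unchanged.

```python
import unicodedata


def nfc(text):
    """Normalize so that precomposed and combining dotted letters compare equal."""
    return unicodedata.normalize("NFC", text)


# Traditional symbol -> revised symbol, as listed in the post and its update.
RESPELLING = {nfc(old): nfc(new) for old, new in {
    "β": "v",
    "ḍ": "ṭ",
    "γ": "ḳ",
    "qq": "ḳḳ",
    "ẓ": "ṣ",
    "ẓẓ": "ṣṣ",
}.items()}


def respell(form):
    """Rewrite a reconstructed form symbol by symbol, longest match first."""
    form = nfc(form)
    keys = sorted(RESPELLING, key=len, reverse=True)
    out, i = [], 0
    while i < len(form):
        for key in keys:
            if form.startswith(key, i):
                out.append(RESPELLING[key])
                i += len(key)
                break
        else:
            out.append(form[i])  # unlisted symbols pass through unchanged
            i += 1
    return "".join(out)


print(respell("*ulβ"))  # *ulv
```

Because the mapping lives in one table, a later change of mind about a symbol means editing a single entry rather than every cited form.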

12 Dec 2010

Giving and having in Indo-European

In my last post, I was noticing the link between Etruscan genitives in "give" constructions which mark the recipient of a gift and clauses conveying "having", as per John Newman's Give: A cognitive linguistic study (1996). On that note, there are some extraneous connections that come to my mind in other ancient languages I know of.

I've reasoned for a while now that the source of Indo-European's thematic genitives in *-osyo like *h₁éḱwosyo 'of the horse' is quite simple: the athematic genitive *-ós plus endingless relative pronoun *yo-. This construction would have first developed in Pre-IE (specifically Late IE) as *-asya, replacing former accented genitive *-ás, when Acrostatic Regularization risked making the nominative and genitive identical in the thematic paradigm. The addition of *ya (the original endingless form of the relative pronoun used for nominative, locative and inanimate accusative cases) helped disambiguate and reinforce thematic genitives. This resultant construction, instead of conveying the direct but potentially ambiguous phrase "of X", used the circumlocution "(with) which [is] of X".

With Newman's insights, we might even reinterpret "which [is] of X" as "which X [has]" since a lack of "to have" in Proto-Indo-European encourages a speaker to use the verb "to be" plus a genitive noun to express the possessor. The distinct but semantically equivalent phrases we take for granted in English like "the horse's speed", "the speed [which is] of the horse" and "the speed [which] the horse has" all become a little blurry in such languages.

Then I wonder further. I've already noticed that there's no rational motivation to reconstruct a distinct dative case in pre-IE, if not in IE itself^[1]. The dative in *-ei must have only later originated from the pre-existing locative ending in *-i and/or from analogy with *h₁ei- 'to go (to)'. So in pre-IE or IE, without an available dative form, what case is left to express the recipient in phrases using the verb *deh₃- 'to give'?

NOTES
^[1] Francisco Adrados, On the origins of the Indo-European dative-locative singular endings published in Languages and cultures: Studies in honor of Edgar C. Polomé (1988), p.29 (see link).

4 Oct 2010

Vetch and pea sail to Italy

There are many things to discuss lately. For example, on Phoenix's blog, Proto-Indo-European reduplication is revisited and I might have a few more thoughts on this. However, for now I'll complete the short thread concerning my previous suggestion of Aegean roots for 'pulse' and 'vetch', this time slightly modified to *árapu (> Minoan *árapu > Gk ὄροβος 'bitter vetch') and derivative *árapinta (> Minoan *arápinta > Gk ἐρέβινθος 'chick-pea').

What I wanted to share is that there are further interesting comparanda apparently isolated in Western Europe that many other scholars also believe are indicative of some sort of substrate, although no one is very specific about its transmission. Of course, as always, it's this vagueness that drives me nuts, so let's explore this more:

Latin ervum 'pulse, bitter vetch'
Germanic *arwītō 'pea' (hence OHG arawiz)

As anyone can see, their relationship to Gk ἐρέβινθος is clear. Yet trying to explain this away with Indo-European roots isn't the solution here. Some Indoeuropeanists have nonetheless attempted to reconstruct some ridiculous roots like (*)*orgʷindʰ- or (*)*h₃ergʷindʰ, for example, which fail to address the incoherence of Germanic *-w- beside Greek -b-, not to mention the erratic vocalism (ie. Germanic *ar- vs. Greek er-)! Surely a substrate word must be at work here, not an inherited Indo-European root with a whack-load of irregular sound changes.

Then there's also Latin arbōs ~ arbor 'tree'. According to the OED, Latin arbōs is of "unknown origin". As usual, some obsessive Indoeuropeanists have attempted to explain this word away as yet another IE root (eg. Julius Pokorny and *erəd- 'to grow'). These numerous "Western IE" roots fail to convince and it's interesting that arbōs is localized purely within the Italic branch. For that matter, what other Italic cognates exist alongside this Latin term, if any?

I'm also interested in the history of Latin herba 'grass'. If we include this and arbōs as part of the substrate evidence, I wonder whether the meaning of this underlying root could be more general, such as 'sprout'. I'll have to look further and see what other ideas have been published on these interesting words.

If we trek onward and theorize an Etrusco-Rhaetic cognate in Italy, and given my latest rules of sound correspondence, we should then expect *arpu 'sprout', which would explain both ervum and arbōs in Latin, and *arpintʰ 'pea', which would explain Germanic *arwītō (perhaps via a Venetic intermediary, *arwi(n)ton).

25 Sept 2010

Adapting the rule of Cyprian Syncope

Recap: What is Cyprian Syncope?

Cyprian Syncope is a sound rule that I noticed on my own several years ago when first pondering on the language origins of Etruscan. Having recognized like many others that Minoan must fall under a Proto-Aegean language family, distinct from Indo-European or Semitic, I then reasoned that Etruscan phonotactics must have been simpler in its more recent past, aligning more with the much-stricter phonotactics of Minoan which only appears to have allowed syllables of a (C)V(C)-shape.

Cleaving Proto-Aegean into two branches, Minoan and Cyprian, I noticed that some Minoan vowels were being deleted in later Cyprian tongues due to some sort of very early stress accent, sometimes creating new word-initial consonant clusters that couldn't have been possible in Minoan. Etruscan, Lemnian and Rhaetic all have word-initial consonant clusters, showing that if they were created from vowel deletion, this must have occurred when they were once a single idiom back around 1000 BCE (ie. when these languages first arrived in Italy). This rule of syncope is unrelated to a later second syncope in Etruscan which has already been widely remarked by past Etruscanists and which took place around 500 BCE. As far as I've read, no Etruscanist has published a word on this first Syncope that I'm exploring openly here, as I have in the past online.

A slight change

This past week, reviewing my research, a new corollary on Cyprian Syncope came to me. Vowel deletion isn't always guaranteed, it seems, and I've been striving to understand why. Certainly I long ago saw this in derivational suffixes of a CV shape, eg. Proto-Aegean *-na [pertinentive] becomes both Minoan and Etruscan -na without vowel deletion. I also noticed later that a word-final structure of -CCV within a word also blocks vowel deletion. Thus the original structure of Proto-Aegean *tʰaura 'bull' (> Greek ταῦρος) is likewise preserved in Etruscan θaura. Recently though, I've been grappling with other notorious wanderworts like 'apple' and 'bee' in Western Europe, seeking Aegean solutions to these riddles, only to find that there is a new implication that some trisyllabic words with initial accent fail to delete the word-final vowel.

Without going into details about reconstructions I haven't yet detailed on this blog, I think I've arrived at a very phonetically plausible revision of the general vowel deletion rule by noting a preceding accent shift in specific cases. Thus:

1. Euphonic Accent Shift: Word-initial *CəCV́- where both consonants (C) are plosives attracts stress to the first syllable: *CəCV́- → *CV́Cə-

2. The Cyprian Syncope Rule: Any vowel in a syllable immediately preceding or following a stressed syllable is deleted.

This initial accent shift prevents consonant clusters like those perfectly valid in Greek (eg. κτεατίζω 'to gain' or χθών 'earth') from ever forming in Cyprian, thereby explaining why they are completely absent in Etruscan despite its several Greek loans.

The following table shows the regular patterns in correspondence I witness that are emerging from the attested and substratal data and will hopefully illustrate how the above rules can explain them:

Proto-Aegean	Cyprian (before Syncope)	Cyprian (after Syncope)
*aléli 'lily'	*^əlél^ə	*lel
*ápia 'bee'	*ápi^ə	*ápi
*apísa 'pear'	*^əpís^ə	*pis
*árapo 'sprout'	*ár^əpu	*árpu
*talóza 'sea'	*t^əlúz^ə	*tlus
*ṭapúri 'village'	*z^əpúr^ə	*spur
*ṭínau 'moulded'	*zín^əu	*zinu
*tʰáura 'bull'	*tʰáura	*tʰáura
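The deletion step (rule 2) can be sketched in a few lines. This is a minimal sketch under stated assumptions: words are supplied already divided into syllables, the stressed syllable is given by index, and the `Syllable` class and `cyprian_syncope` function are invented for illustration. It applies rule 2 literally and nothing else, so it models neither the accent shift of rule 1, nor the blocking conditions discussed above, nor the vowel raising and final devoicing visible in some rows of the table; its output also omits the accent mark.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    onset: str       # consonants before the vowel
    nucleus: str     # the vowel, written without its accent mark
    coda: str = ""   # consonants after the vowel


def cyprian_syncope(syllables, stressed):
    """Delete the vowel of every syllable directly adjacent to the stressed one."""
    pieces = []
    for i, syl in enumerate(syllables):
        if abs(i - stressed) == 1:
            pieces.append(syl.onset + syl.coda)  # vowel lost
        else:
            pieces.append(syl.onset + syl.nucleus + syl.coda)
    return "".join(pieces)


aleli = [Syllable("", "a"), Syllable("l", "e"), Syllable("l", "i")]
apisa = [Syllable("", "a"), Syllable("p", "i"), Syllable("s", "a")]
print(cyprian_syncope(aleli, stressed=1))  # lel
print(cyprian_syncope(apisa, stressed=1))  # pis
```

The two sample words reproduce the 'lily' and 'pear' rows; rows such as 'bee' and 'bull' depend on the extra conditions that this sketch leaves out.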

22 Sept 2010

From whence Sanskrit kapúcchala?

As I've probably mentioned before, I strongly suspect Julius Pokorny and followers have lazily lumped Sanskrit kapúcchala 'tuft of hair from the back of the head' in with other evidence supposedly supporting Proto-Indo-European (PIE) *kaput 'head', all just to give the illusion that the evidence is more robust and geographically dispersed than it honestly is. It's also a lot easier in any bureaucracy, including in academia, to simply go with the flow and ne'er question the status quo. However in this case, I'm fortunately not the only one out there that thinks this smells fishy. I insist that this PIE root never existed and that there are only Western European reflexes of this 'head' word, all attributable to loans from the Aegean family during the 2nd millennium BCE and later, ie. from either Minoan *kaupada (> Greek κεφαλή) or Etruscan *kaupaθ (> Latin caput; indirectly into Germanic as *haubidaz prior to Grimm's Law, perhaps through Venetic).

Though I found one lead online stating that Mayrhofer once dared to analyse kapucchala into a pejorative prefix ka- plus puccha- 'tail' (Mayrhofer, Kurzgefaßtes etymologisches Wörterbuch des Altindischen [1956], p.157), I've just come across a curious entry in both Cologne Digital Sanskrit Dictionary and Capeller's Sanskrit-English Dictionary that identifies the syllable क ka alone as 'head'. This tickles me. Since I knew already that पुच्छ puccha meant 'tail', this implies that कपुच्छ ka-púccha-la- with diminutive -la- just means 'little head-tail', perfectly fitting for a tuft at the back of the head.

If the word can be explained purely in Sanskrit terms, a PIE origin would be woefully extravagant by comparison and then easily dismissed as bunk. The other spelling kaputsala would be just an alternative phonetically-faithful rendering and certainly adds nothing to the arguments of the **kaput camp until they can substantiate both **kaput and **śala-. Even the justification for this unmotivated segmentation of the word is lacking. It seems to be based on wishful thinking.^[1]

That being said, now I'm having trouble confirming the source of the equation ka = 'head'. Is it attested somewhere directly? Or is this purely assumed by 19th-century Indicists attempting to etymologize Sanskrit vocabulary (in which case, an asterisked *ka is in order)? Oddly enough, there are a few other words that strongly seem prefixed with this morpheme ka-: क-स्तम्भी kastambhī 'prop of a carriage-pole' (cf. स्तम्भ stambha 'post, pillar') and कं-धर kaṃdhara 'neck' (lit. 'head-bearer', cf. धर dhara 'supporting').

Rejecting PIE **kaput, what then is the etymology of Sanskrit क ka 'head'?

NOTES
^[1] After posting this, I managed to discover one tantalizing lead that may help settle this issue (see Brugmann/Streitberg, Indogermanische Forschungen, v. 3 [1894], p.236) which I subsequently posted in my commentbox. If I'm reading the German correctly, it seems like the authors are admitting that kaputsala was caused by a more modern modification of kapucchala, based on an etymological whim.

4 May 2010

New pdf on Indo-European verbs

I've put up a new pdf in my Lingua Files section on my views about Proto-Indo-European (PIE) verbal inflection. This pdf is a culmination of many of the posts I've already pushed out on this blog.

As a recap, I had come to a couple of major revelations on PIE that diverge from the "mainstream" but problematic view:

One: The unlikely phonological system can finally be rationalized by turning palatal stops to plain ones and plain stops to uvular ones while shifting phonation to a contrast between creaky and plain voice rather than plain versus breathy.

Two: The traditional "present-aorist-perfect" verb model (which is notorious for being an inadequate model representative only of a post-IE stage) can be reworked into an earlier two-dimensional system of subjective/objective versus progressive/non-progressive to now explain why Anatolian & Tocharian verbs behave so differently.

Now, I use the term progressive to specifically refer only to an affirmative, ongoing action in the realis mood while non-progressive covers everything else, including negative actions regardless of aspect or tense. I've modeled this system partly on what I know of the peculiarities of the Mandarin verb which is also tenseless.

This makes for a very different PIE but these drastic changes are unavoidable if we are to solve some problems that have thus far gone unsolved. I've dared to theorize, if anything, for the sake of my own personal understanding and exploration, but hopefully my summary will also help anyone else interested to understand at a glance what I'm getting at and/or inspire others to blog their own insights and innovative solutions.

9 Jan 2010

Rubbing away the shine (2)

More skepticism of PIE *mer- 'to shine' is to follow. Apparently it was back in 1891 when Friedrich Müller questioned the sense behind attributing the Vedic storm deities called Maruts to a root meaning 'to shine' as opposed to a homophonous root meaning 'to crush', reasoning it out thusly:

"Another etymology, proposed in Böhtlingk's Dictionary, which derives Marut from a root mar, to shine, labours under two disadvantages; first, that there is no such root in Sanskrit; secondly, that the lurid splendour of the lightning is but a subordinate feature in the character of the Maruts."^[1]

None of these facts have changed. The verb mṛṇā́ti 'crushes, grinds' is always available to the Sanskrit etymologist but a verb root paralleling Greek marmáirein 'to shine' is absent. Does a 19th-century scholar still have a point? Have Indo-Europeanists gotten ahead of themselves attributing a PIE root behind every relationship blindly? Skepticism concerning this root, in regards to another meaning given to it which strives to explain Greek words relating to 'portion' and 'fate', is echoed more recently by Peter Schrijver in Indo-European *(s)mer- in Greek and Celtic published in Indo-European Perspectives (2004): "Yet the other cornerstone of IE reconstruction beside archaic morphology, viz. comparative evidence from other IE languages, would seem to be almost completely lacking."^[2]

If मरुत marút may be so etymologized, such that these storm gods 'crush' and 'pummel' with thunder^[3] rather than 'shine' through lightning, then surely so may Sanskrit márīci- 'mote or speck in the air' or 'particle of light' be likewise attributed to the homophonous root referring to crushing, grinding and wearing things away. Latin merus 'pure, unmixed, unadulterated' can also make better sense this way too (ie. 'worn away' → 'mere' → 'unadulterated', 'pure', 'bare'). Nothing here requires a source from 'to shine' and the issue seems even to become burdened by extra assumptions when we do. So it really begs the question whether it existed at all in PIE. Perhaps we should wonder from where Greek obtained marmáirein and related words pertaining to 'shining' if not from PIE and resist a biased tendency to see Indo-European in everything beyond what's sensible.

NOTES
^[1] Müller, Vedic Hymns, Part I: Hymns to the Maruts, Rudra, Vayu, and Vata (1891) (see link).
^[2] Schrijver, "Indo-European *(s)mer- in Greek and Celtic", in Indo-European Perspectives (2004) (see link).
^[3] Griffith/Shastri, The hymns of the Rgveda (1995), p.398 (see link).

2 Nov 2009

A modification of Indo-Aegean, plus some new grammatical ideas on Minoan

I like to explore new ideas and test them as always. One of my ever-evolving ideas is that Indo-European and Aegean descend from a common Proto-Indo-Aegean ancestor datable to 7000 BCE. Or so I've been thinking up to now but...

I decided to explore a radical new extrapolation that's got a grip on my mind recently. What would be the consequences for my theories if Proto-Indo-Aegean were dated as much as a thousand years later, to 6000 BCE? The first interesting thing about this fresh perspective is that 6000 BCE is just before Proto-Semitic began to affect Mid IE (MIE) according to my currently defined chronology. Another interesting thing is that if we take for granted a more Balkans-positioned MIE vis-à-vis the later Ukraine-positioned PIE proper, then it raises the question: Where would this theoretical Proto-Aegean of mine be sitting at this time? The most obvious answer is that it would lie somewhere to the west and/or south of the Balkans, in the general area where it historically emerged (see graphic above). Yet my theory also positions Old IE (OIE) back in the northerly territory later occupied by Late IE, such that the geographical path from OIE to MIE to PIE looks like a meandering vee that points towards the Aegean Sea (see graphic below). This isn't problematic since nothing says that languages have to spread progressively in only one direction over the course of time. However, this pattern, if taken as correct for the sake of argument, teases out a further idea: that Aegean would have been brought to Greece and/or Turkey by that very southerly movement that brought Mid IE into the same trading zone. It's as if what I call "Old IE" circa 7000 BCE should be revised as a still-evolving Indo-Aegean, and the beginning of the Mid IE period should be called "Old IE" at 6000 BCE. The temporary spread of an early stage of PIE to the Balkans and the spread of a related Aegean branch coincide well enough to warrant further pondering.

Given the general conceptual arguments in favour of this deviation from the standard view, I went on to examine all the morphological what-ifs, which have even more profound consequences. The unfortunate problem with Etruscan, Lemnian and Rhaetic (and probably also with Eteo-Cypriot and Eteo-Cretan) is that no personal endings appear to be attached to verbs in these languages, despite the fact that many features like the 1ps and its oblique form (mi and mini), demonstratives and the declensional system (i.e. the demonstrative accusative, s-genitive, animate and inanimate plural endings) all find direct connections to PIE. If Aegean is related to PIE, then something has happened to these endings: they disappeared at some unknown point, for reasons that are perhaps lost in the mists of time.

I refuse to believe the answers aren't recoverable and I don't particularly like mist. I've been poring over Minoan texts recently and, while very hesitant at first, I've been reconsidering the published but nonetheless speculative view held by some that -SI and -TI are the 3ps and 3pp endings respectively. This is an obviously PIE-inspired interpretation and, given the lack of success in translating Minoan with PIE values, we have reason to be skeptical.

Yet...

It's interesting to observe that if we stick by my values of the Libation Formula such that *una (U-NA) means 'libation' (cf. Etruscan un 'libation') with plural *unar (U-NA-RU), and *kan- in KA-NA-SI/KA-NA-TI is cognate with Etruscan cen- 'to bring', then not only do we have a perfectly sensible phrase "a libation was given"/"libations were given" that coincides with the fact that it's written on several Cretan libation tables, but if we take the variation KA-NA-TI in PK Za 11 to be correctly read and written on purpose by scribes to indicate a different inflection, then what we have here is a language with personal endings that apparently have not been completely lost! It would seem that -TI might indeed correlate with plural subjects while -SI would correlate with singular ones.

If we additionally corroborate this with CR (?) Zf 1 (an inscribed gold pin), where we find a perfectly Etruscoid sentence with the ubiquitous SOV word order and with intriguingly Indo-European-like verbal endings, A-MA-WA-SI KA-NI-JA-MI (*Amawasi kaniami 'I (i.e. the pin itself) was brought for Amawa'^[1]), then we have a very exciting verbal system that might help crack the language: 1ps *-mi (cf. PIE *-mi), 3ps *-si (cf. PIE *-ti), and 3pp *-ãti (cf. PIE *-énti).

The reasons for this strange hodgepodge grammar, neither fully Etruscan nor fully PIE by any sensible definition, would then relate back to the modified chronology that I suggest above. Speculation? You bet. But worth a look, I think.

NOTES
^[1] Ego-focussed dedicatory inscriptions such as these were plentiful in later Etruria and are also found in Greek and Faliscan. See for example Pallottino, The Etruscans (1955), p.253 (see link), who cites the Faliscan inscription eco quto ... enotenosio ... 'I (am) the pitcher of ... Enotenus ...'.

26 Oct 2009

Searching for an etymology for Germanic *handuz 'hand'

First, let's get nonsense out of the way by letting a published author state the obvious about origins of the Proto-Germanic etymon *handuz 'hand' that are most implausible yet unfortunately popular among idle hobbyists online. In the words of A. Seidenberg in km, a widespread root for ten (1976):

"The effort to relate km or kmt to *handus, or, more generally said, to see a reference to the hands in the number words, is also ad hoc: there is not the slightest evidence, apart from similar speculations on the other numbers, that the Indo-European number words are derived from finger-counting."

These comments on poor methodology are as true today as they were then, even though this old tomfoolery is resurrected on page 316 of Mallory/Adams, The Oxford introduction to Proto-Indo-European and the Proto-Indo-European world (2006), albeit followed by mild argumentation against the idea.

What then is the current etymology? Apparently no consensus exists yet. For example, The Barnhart dictionary of etymology (1988) says that no cognates of hand exist outside of Germanic. While it's immediately tempting to see an origin in PIE *gʰend- 'to grasp' which yielded Latin praehendere, Greek χανδάνειν and Gothic bi-gitan, formal sound correspondences between PIE and Proto-Germanic forbid us to assume a direct connection with the Germanic root. One would expect a hypothetical PIE u-stem **gʰóndus 'grasper; hand' to end up as **gantuz in Proto-Germanic but certainly not *handuz which rather suggests a non-existent PIE stem **kondʰ-u-. Evidently, these are not the plosives we're looking for and no direct link to Proto-Indo-European appears sensible.
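The mismatch can be checked mechanically. Below is a minimal Python sketch, not a full account of Grimm's Law: the rule table and the function name `to_proto_germanic` are illustrative assumptions of mine, covering only the stop shifts, *o > *a and final *-s > *-z needed to reproduce the two forms discussed above.

```python
import unicodedata

# Hypothetical, heavily simplified rule table (an assumption for illustration).
# Keys use escapes so that the aspiration mark and accents match exactly.
SHIFTS = {
    "g\u02b0": "g", "d\u02b0": "d", "b\u02b0": "b",  # gʰ dʰ bʰ > g d b
    "g": "k", "d": "t", "b": "p",                    # plain voiced > voiceless
    "k": "h", "t": "\u00fe", "p": "f",               # voiceless > fricative (t > þ)
    "o": "a", "\u00f3": "a",                         # o, ó > a
}

def to_proto_germanic(pie_form: str) -> str:
    """Apply the toy shifts in a single left-to-right pass (no rule chaining)."""
    form = unicodedata.normalize("NFC", pie_form.strip("*"))
    out = []
    i = 0
    while i < len(form):
        pair = form[i:i + 2]
        if pair in SHIFTS:            # aspirated stops are two code points
            out.append(SHIFTS[pair])
            i += 2
        elif form[i] in SHIFTS:
            out.append(SHIFTS[form[i]])
            i += 1
        else:
            out.append(form[i])
            i += 1
    result = "".join(out)
    if result.endswith("s"):          # simplified: final *-s > *-z
        result = result[:-1] + "z"
    return result

print(to_proto_germanic("g\u02b0\u00f3ndus"))  # the hypothetical u-stem
print(to_proto_germanic("kond\u02b0us"))       # the stem *handuz would need
```

The single pass matters: applying the rules one after another would wrongly chain gʰ > g > k > h, so each input segment is shifted exactly once.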

So I had a sudden brainwave and the more I think about it the more sense it makes, although it's frustratingly hard to substantiate. Since there are already a few known Proto-Germanic terms borrowed from Latin in the early first millennium BCE after Grimm's Law had taken place (cf. Ringe, From Proto-Indo-European to Proto-Germanic (2006), p.296), it makes me wonder if one of them might have been our Germanic word in question.

For this crazy idea to stick, we require a Latin word *handus, but as the reader can tell by the asterisk, it doesn't exist (at least as far as I know). On the other hand, prae-hendere 'to seize, to grasp' does indeed exist and the prefix prae- 'before' is secondary. From this implied Latin verb root *hend-, we are certainly free to muse light-heartedly on how we might obtain *handus 'grasper' from it and, very curiously, to note how it rhymes with the attested Latin word for 'hand', manus. It taunts me with the image of a northern Germanic community with a high degree of Latin bilingualism, inventing new words and idioms out of a faraway language. If only my Germanic-influenced Latin word *handus for proper manus were attested in Roman records, I might develop something more out of this thought.

28 Sept 2009

The PIE *to-participle in my subjective-objective model

I'm exploring an interesting idea involving participles, related to my previously concluded model of PIE conjugation involving two dimensions contrasting subjective with objective (ie. the source of hi- and mi-class respectively) and progressive with non-progressive. I had come to this model in order to explain how Anatolian-Tocharian dialects relate to the other dialects which I call Core IE. As with all things theoretical, I don't know whether I'm completely legitimized by the existing facts in exploring this idea, so take this all with a grain of salt and take thrill in the cerebral journey.

In a nutshell, if my quadripartite system distinguishes four sets of endings exemplified in the 1ps with *-mi (objective progressive), *-m (objective non-progressive), *-h₂ór (subjective progressive) and *-h₂e (subjective non-progressive), then it stands to follow that there may likewise be four non-finite forms, participles, corresponding to each of the four categories I describe. From the evident participle suffixes, we then seem to be inevitably led to the following system:

	objective	subjective
progressive	*-ónt-	*-m(h₁)nó-
non-progressive	*-tó-	*-wós-

However, Szemerényi informs us in Introduction to Indo-European linguistics (1996)^[1] that the PIE *tó-participle is just not present in Tocharian and Anatolian. From this absence of evidence, it's understandably concluded that the participle hadn't yet formed. Only once Anatolian and Tocharian parted ways would the emerging Core IE dialects create this new participial form.

This status quo account is admittedly very persuasive... as long as one forgets to question how such a suffix could be formed from known PIE grammar with exactly the semantics required to make it by far the prevailing participle form, above all other possible thematic suffixes like *-nó-, *-mó- and *-ló-, among others, which are also occasionally used. Why did all of the Core IE dialects agree on this one suffix with *-t-? I have a defiant answer: Maybe it had been a participle ending right from the start and the Anatolian-Tocharian area was motivated to chuck this one ending away. But then, why?

Falling back on my recent insights on how Anatolian-Tocharian emerged out of my model (note too my later relabelling of "eventive" as "progressive" in this model), I realize one interesting motivation for a conjectural loss of this participle. Notice that my theory suggests that Anatolian-Tocharian dialects were developing tense out of a tenseless system, making the former progressive marker *-i a present tense marker. In effect, the four-way system of old was reshaped into a three-way system of mi-class, hi-class and middle. That means that one of the participles had to go, and guess which one! So these dialects must have ended up with a mi-class participle in *-ónt-, a hi-class participle in *-wós- and a middle participle in *-m(h₁)nó-. There would have been no longer any room for a *tó-participle in this particular evolution as it would only duplicate the function of one of the other three, hence a loss specifically in Anatolian and Tocharian of a now-redundant element.
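The bookkeeping behind that last step can be laid out as a small Python sketch. The dictionary layout and the helper name `redundant_participles` are my own illustrative assumptions; the suffixes and their assignments are simply those given in the table and paragraph above.

```python
# Participle suffixes as named in the text.
ONT, MNO, TO, WOS = "*-ónt-", "*-m(h₁)nó-", "*-tó-", "*-wós-"

# Proposed four-way system, keyed by (voice, aspect).
FOUR_WAY = {
    ("objective", "progressive"): ONT,
    ("subjective", "progressive"): MNO,
    ("objective", "non-progressive"): TO,
    ("subjective", "non-progressive"): WOS,
}

# Three-way system proposed for Anatolian-Tocharian.
THREE_WAY = {"mi-class": ONT, "hi-class": WOS, "middle": MNO}

def redundant_participles(four_way, three_way):
    """Return suffixes of the older system with no slot in the newer one."""
    kept = set(three_way.values())
    return [suffix for suffix in four_way.values() if suffix not in kept]

print(redundant_participles(FOUR_WAY, THREE_WAY))
```

Under these assignments, the only suffix left without a category is the *tó-participle, which is the loss the model predicts.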

NOTES
^[1] Szemerényi, Introduction to Indo-European linguistics (1996), p.323 (see link): "The suffix -to- is widespread in all IE languages except Anatolian and Tocharian." Well that was pretty straightforward, wasn't it?

23 Nov 2008

Laryngeal abuse - Phonemes caught in the reconstructive crossfire

Today I'm here to warn you about the tragedy of laryngeal abuse. This is when you see a long vowel in some proto-language and a devilish thought comes to mind like "Gee, I wonder if that long vowel is underlyingly the result of a vowel-plus-laryngeal combo?" And then before you know it, you've gone and rearranged the entire proto-language according to your laryngeal-obsessed whims. Laryngeals are fun, but we have to keep a level head too.

I remain convinced that Bhadriraju Krishnamurti's version of Proto-Dravidian is one such example of this laryngeal abuse at work, and it seems to me that this can be quickly resolved by examining what a mess he makes out of the pronominal system of this language. As far as I'm concerned it's supposed to look like this (i.e. how most Dravidianists reconstruct it):

	singular	plural
1p	yān (obl. yan-)	yām / nām (obl. yam- /nam-)
2p	nīn (obl. nin-)	nīm (obl. nim-)
3p (reflexive)	tān (obl. tan-)	tām (obl. tam-)

However, Krishnamurti has proposed the following^[1]:

	singular	plural
1p	*yaHn	*yaHm / *ñaHm
2p	*niHn	*niHm
3p (reflexive)	*taHn	*taHm

He then suggests that the laryngeals disappear in oblique forms. However the process by which this happens is obscure and left unexplained. In contrast, the idea that long vowels are reduced when used in oblique cases or when preposed to another noun is natural and commonplace. For example, we may take note of French moi "me" versus enclitic m(e) "me, myself", the latter being preposed to verbs as the object (e.g. Elles m'aident. "They help me."; Vous me dérangez "You disturb me."). Since we know where French comes from (i.e. Latin, of course!), we know how absurd and off-track it would be to reconstruct Proto-Latin **me(H) "me" in ignorance of attested Latin, placing a laryngeal in there that appears and disappears conveniently like the Cheshire Cat without rhyme or reason.
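To show how little machinery the shortening account needs, here is a minimal Python sketch. The function name `oblique_stem` and the two-entry vowel table are assumptions for illustration, covering only the long vowels that occur in the pronouns tabulated above.

```python
import unicodedata

# Long vowel > short vowel; only the two vowels found in these pronouns.
SHORTEN = {"\u0101": "a", "\u012b": "i"}  # ā > a, ī > i

def oblique_stem(nominative: str) -> str:
    """Derive the oblique stem by shortening any long vowel."""
    form = unicodedata.normalize("NFC", nominative)
    return "".join(SHORTEN.get(ch, ch) for ch in form) + "-"

for pronoun in ["y\u0101n", "n\u012bn", "t\u0101n", "y\u0101m", "n\u012bm", "t\u0101m"]:
    print(pronoun, "->", oblique_stem(pronoun))
```

A single, phonetically ordinary rule covers every row of the table, with no segment that has to appear and then vanish.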

I have to say that I just don't buy Krishnamurti's reworking of the pronominal system. Whether Dravidian ultimately has a few laryngeals lurking about is, to be fair, a separate issue that may still hold true, but these pronouns surely don't contain any. To add them here makes analysis more difficult rather than less.

NOTES
^[1] Krishnamurti, Comparative Dravidian Linguistics (2001), p.336 (see link).

13 Nov 2008

Confused about PIE's intensive particle *ge

I'm so confused about the "intensive particle" in Proto-Indo-European (PIE) right now. The exact nature of the particle is related to my previous ponderings on uvulars and their Pre-IE origins. It seems that some Indo-Europeanists reconstruct *ǵe^[1] and some reconstruct *ge. Then there's also *gʰe which appears to be reconstructed alongside *ǵʰi as in the emphatic negation *né-ǵʰi "not at all"^[2]. All of them are supposedly "intensive" particles with the same function.

What makes this more confusing is that I'm pretty sure that the pronoun *h₁éǵoh₂ "I" has to be the product of *e, *ǵe [intensive particle] and *-oh₂ [old 1ps subjunctive]. If so, everything in that word implies that the velar was originally *ǵ, not *g (see Paleoglot, The Origin of Indo-European Ego, Apr 07 2008). Yet if it started out as *ǵ, it can't explain what appears to be an intensive or punctual suffix *-g- used on verbs like *yeu-g- "to join" (cf. *yeu- "to join") and *bʰoh₁-g- "to bake" (cf. *bʰeh₁- "to warm"). Surely this is connected, no? It also seems suspect that a productive particle or suffix would have used such a marked phoneme (i.e. as I've stated earlier, *g seems likely to me to be a uvular, creaky-voiced stop rather than a "plain" one as per traditional reconstruction). My instinct is telling me that it surely must have once been *ǵ (i.e. a plain voiced velar stop in the revised reconstruction) but then this denies a link to the verbal extension in uvular *-g-.

I'm so confused and so far I can't make heads or tails of it, yet I know that all of these things must be connected somehow.

UPDATES
(November 13 2008) Corrected the definition of *bʰeh₁- from "to burn" to "to warm". It's just a slight technicality that doesn't affect my above reasoning.

NOTES
^[1] Beekes, Comparative Indo-European Linguistics (1995), p.222 (see link).
^[2] Both unpalatalized *gʰe and palatalized *né-ǵʰi with different voiced velars are shown boldly on the same page of Mallory/Adams, The Oxford Introduction to Proto-Indo-European and the Proto-Indo-European World (2006), p.69 (see link), emphasizing my point that something may be a little wonky with the reconstruction of this particle which appears to have too many possible forms: *ǵe, *ge, *ǵʰe, *gʰe, *ǵʰi or *gʰo.
