31 Oct 2007

More isogloss amusement for the linguistic nerd at heart

I have things to update regarding the Etruscan dictionary but the months from October to December are ugly times of the year when time is divided up into tinier and tinier morsels. There's no avoiding the ol' Samhain and to be frank, what an excellent excuse to have a drink (or five) with friends. I felt in the last entry I wrote, I only scratched the surface of isoglosses and the concept of "language waves". What I want to put out as an idea now is something that I've never directly come across reading. Perhaps there's some rare soul out there writing about what I'm about to explain or, just maybe, I'm actually original for once.

If we come to fully understand that languages are nothing more than overlaps of a million-and-one possible linguistic features spreading and shrinking across a given geographical region, then we should comprehend that Indo-European was never truly a single language and yet that its reconstruction is not without purpose either. The reconstruction of this language in light of language wave models makes prehistory clearer because we free ourselves from the trap of believing falsely that Indo-Europeans were a single people, single language or single culture while still ascertaining important details about the language(s) of that time period. It liberates us from a lot of sensationalist nonsense still written out there about so-called "Indo-Europeans", as if to say that the similarity of one's spoken languages was somehow in itself a factor that unified prehistoric people in all of their other aspects such as culture or genetics, let's say. To use the term "an Indo-European" is as imprecise as saying "an Asian", which is why I prefer to specifically say "Indo-European speaker" when speaking about the peoples who spoke these languages, while avoiding the term "Indo-European" altogether when speaking about archaeology where no one archaeological culture can logically match perfectly the linguistic region of the Indo-European speech area. Material culture and language are quite obviously separate things and should seldom ever match.

Now, let's talk about using isoglosses in a diachronic way, that is to say, making note of features in IE and surrounding languages and how these might have changed over time from pre-IE to IE itself. For some reason, I've never heard of anyone coming up with this idea before, but I can't imagine why such a powerful idea wouldn't be used. To illustrate, I amused myself by creating a casual proposal based on my general impressions of Indo-European's larger linguistic environment:

What I'm doing here as a demonstration of the hidden potential of isogloss mapping is keeping track of some general features that seem to recur over and over again in certain areas. I get the vague impression that certain areas have a "regional linguistic memory" and changes that occurred long ago in another time seem to magically reappear in a different way. If my impressions are right, language waves would be the reason for that. Linguistic features are in essence independent entities from the languages that host them, bizarre as this may sound. As a result, I think that a feature can loiter in an area for thousands of years, only to then latch on to some other language and alter it. This occurs for a very long time until some unpredictable circumstance breaks the cycle of that feature in the region.

So in the diagram above, I toy with the idea that certain areas to the east and south, for example, had rich phonologies using things like ejectives, labialized sounds and palatalized sounds while also containing few vowels, such as the areas I propose for Abkhaz-Adyghe (AbAd) and Mid IE, the ancestor of Proto-Indo-European (PIE or IE). MIE is indicated by a circle with an arrowed line which points to later IE 1 and IE 2 in red. IE 1 represents the older branches like Anatolian and Tocharian while IE 2 are dialects such as Germanic, Italic, Hellenic and Indo-Iranian. Allan Bomhard had already suggested in Indo-European and the Nostratic Hypothesis (1996) that AbAd and IE may have interacted with each other in some pre-IE stage although he didn't supply strong enough evidence to prove it. Instead he resorted to using what was accessible to him then, Proto-Circassian, a specific branch of AbAd, as possible clues to further information. While the proof is still lacking, I just can't let go of that idea and I think that it's correct, implying that IE moved from east to west towards its general location in the North-West Pontic area. In fact, from what I theorize for Mid IE, just like AbAd, it must have had a richer consonant system than later IE (i.e. added sounds like *sʷ and *tʷ), although it lacked palatalization as a distinct phonological feature from what I can tell, and it had a reduced vowel system of *e and *a. (Note that I feel that *i and *u at this time were treated only as semivocalic consonants. I differ considerably from Bomhard's views in regards to the evolution of the vowel system and the origin of IE ablaut however and I suppose I should iron out a draft of my current views on the exact evolution of IE. No rest for the wicked.)

At any rate, back to the concept of regional linguistic memory, I've tried to illustrate also in this diagram how I think changes that occurred in a previous stage of a language might actually recur centuries or millennia later in the same area in a daughter form of the tongue or even another language altogether. I think this has to do with bilingualism. When you look at the dialect areas above, what I'm encircling are general "quirks" that seem to show up in certain directions away from Core IE. For example, based on the diagram above, we could imagine that satem languages emerged first as Late Indo-European dialects that happened to venture into the "palatalizing" region, or in other words, the region where palatalized phonemes were distinct sounds in the phonologies of unknown ancient languages already spoken there that preceded IE. We can then notice that, lo and behold, Tocharian might have added new palatalized phonemes to its sound inventory because it passed through this same area! Overall, the use of palatalization seems to be a distinctly eastern feature. Cool? There's more.

Maybe some northern IE dialects merged *d and *dʰ together simply because they entered an area where only /t/ and /d/ occurred in local tongues. For people in such a region habituated only to /t/ and /d/, the introduction of IE would be problematic, since they might have a strong tendency to pronounce Indo-European's *d and *dʰ both as /d/ in ignorance. Now imagine such bilingual people prospering from flourishing trade, having many babies, and then adopting Indo-European as their tongue. Suddenly we have a new dialect where *d and *dʰ have merged thanks to substratal influence. In this scenario, the lost language (substrate) affects the new dominant language before disappearing. In effect then, a language seldom disappears without a trace.
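To make the substrate scenario concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than reconstructed data: the three-stop IE inventory, the local inventory of only /t/ and /d/, and the `FALLBACKS` table saying which local sound a speaker reaches for when a foreign one is missing.

```python
# Hypothetical fallback table: which local sound a learner substitutes
# for an unfamiliar IE sound (an assumption made for illustration only).
FALLBACKS = {"dʰ": ["d"], "d": ["t"], "t": ["d"]}

def nearest_local_sound(phoneme, local_inventory, fallbacks=FALLBACKS):
    """Return `phoneme` if the local language has it, else the first usable fallback."""
    if phoneme in local_inventory:
        return phoneme
    for candidate in fallbacks.get(phoneme, []):
        if candidate in local_inventory:
            return candidate
    return phoneme  # no substitute known: keep the foreign sound as-is

def substrate_dialect(ie_inventory, local_inventory):
    """Map each IE phoneme to the sound that bilingual speakers would produce."""
    return {p: nearest_local_sound(p, local_inventory) for p in ie_inventory}

merged = substrate_dialect(["t", "d", "dʰ"], {"t", "d"})
print(merged)  # *d and *dʰ now share one pronunciation
```

The point of the toy is only that the merger falls out of the local inventory, not out of anything internal to IE itself.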

Then also, hopefully you may have noticed that I purposely show Indo-European and Aegean with shared isoglosses. Aegean is a language family from which I believe Etruscan, Lemnian, Rhaetic, Eteo-Cypriot, Eteo-Cretan and Minoan sprang. Minoan seems to contain "o" if the writing system is any indication but the rest, the Etrusco-Cypriot languages which center in Western Anatolia, seem to lack it. This is an interesting phenomenon in the region because not only did Anatolian IE languages rather coincidentally lose this vowel but there are also a number of other ancient languages in the wider Near East that seem to have avoided this vowel too (e.g. Sumerian and Akkadian).

Perhaps these are vague ideas but you all should get the drift about how exciting isoglosses can be if you're a nerd like me. They have the potential of organizing incredible details about not only the familiar languages but all the dialects that died out without a trace (or so we might have erroneously thought in previous centuries). With those seductive thoughts, may you toss and turn as I do, blessed with chronic insomnia as you think about these proto-language puzzles for days to come. Hehehe.

28 Oct 2007

Language waves and the satem innovation in PIE

I want to talk about the awesome, yet unappreciated, power of isogloss maps. I find that a lot of people in general misunderstand Proto-Indo-European (PIE) to have been a single, unified language when in fact this could never possibly have been the case. At any given time, PIE would have been an assemblage of dialects from start to finish. Of course, when I say that, I will probably be misunderstood to have meant that the reconstruction of Proto-IE is invalid or futile, which I stress is absolutely not my position at all.

Rather I think that by conceiving of language change in a better way, as I will explain below, we can gain a lot more insight into protolanguages than is currently the norm. To me, the confusion about language evolution stems from a popular, antiquated notion of language change as represented by the all-too-famous "family tree" diagrams. We can see the concept in this picture or in this one. As the diagram implies, language divides into various new languages as time goes on much like how a tree grows new branches as it ages. This is a horrible way to understand language change.

A better way is to recognize language, not as a single entity, but rather as a package of features (whether they be grammatical features or phonetic features) that coexist within said language. Furthermore, these features may spread like waves over a geographical area much like waves in an ocean and therefore spread into other neighbouring dialects or even completely different languages. When one language affects another, dialectologists call it areal influence.

So now, if we can truly understand that languages are nothing more than the mere intersection of spoken features within a specific geographical area and that each of these features may be represented as independently spreading waves, then we can now understand what I mean when I say that PIE was never a single language and had always been a sea of regional dialects.
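For the programmers in the audience, the "package of features" idea can be sketched in a few lines of Python. The place names and feature names below are purely hypothetical placeholders, not a claim about any real dialect map; each isogloss is simply the set of places where a feature is in use, and a "dialect" is whatever bundle of features happens to overlap at one spot.

```python
# Hypothetical isogloss map: feature name -> set of places where it is used.
isoglosses = {
    "palatalized_velars": {"east_1", "east_2", "centre"},
    "d_dh_merger": {"north_1", "centre"},
    "no_o_vowel": {"south_1", "east_2"},
}

def features_at(place, isogloss_map):
    """Return the set of features whose isogloss covers `place`."""
    return {name for name, area in isogloss_map.items() if place in area}

def spread(isogloss_map, feature, new_places):
    """Return a copy of the map in which `feature` has spread to `new_places`."""
    updated = {name: set(area) for name, area in isogloss_map.items()}
    updated.setdefault(feature, set()).update(new_places)
    return updated

# A feature "wave" reaching a new place changes the dialect spoken there
# without any language having split off from another.
later = spread(isoglosses, "palatalized_velars", {"north_1"})
print(features_at("north_1", isoglosses))
print(features_at("north_1", later))
```

Notice that nothing in this model is a "language" as such; the tree diagram has no natural place in it.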

Now that we have that clarified, I want to explore a crazy idea. It's untested but even if it's wrong, it should give you an example of an as-yet underappreciated process using isogloss mapping to tease out interesting details about a protolanguage's history that would otherwise go unpondered. I want to posit an idea that the satem innovation of PIE was in fact caused by areal influence from a "para-dialect", that is, a dialect lying just outside the boundaries of PIE itself. A dialect, in other words, that was almost-but-not-quite PIE. One that only shared some features with PIE because it had split away at an earlier date than Anatolian or Tocharian while innovating new features. The only way I can fully explain how this intriguing idea would work is to illustrate this in a video using my skills with Macromedia Flash.

In this rough sketch presented in my video below, the p-Satem (or para-Satem) dialect is a hypothetical paradialect that could have perhaps provoked the Satem innovation in PIE. I toy with this idea to illustrate for you how language waves and isogloss maps can help us explore new details about ancient languages.

24 Oct 2007

Four Stone Hearth: Volume 26

23 Oct 2007

TLE 58: A reader challenge

Hey, fellow readers. I have a challenge for you. I want you to look at the following entertaining versions of a single inscription indexed by Massimo Pallottino as TLE 58.

D'Aversa, La lingua degli etruschi (1979), p.116:
mini kaisie θannursiannaś mulvannice

Agostiniani, Le "iscrizioni parlanti" dell'Italia antica (1982), p.201:
mini kaisie θanurśiannaś mulvannice

Georgiev, La lingua e l'origine degli Etruschi (1979), p.65:
mini kaisie θannur(s) sianna mulvanice

Olzscha, Interpretation der Agramer Mumienbinde (Klio, Beiheft 40) (1939), p.43:
mini kaisie θannursi annat mulvannice

Wow! Who knew that the academic mutilation of history could be so much fun? And as usual, there are no clear pictures of this object easily accessible to the general public. Too bad for you. So the challenge for you, the hapless reader, is to piece together which, if any, of the above versions of the same inscription is the right one. Write your answers with justifications in the comment box. Good luck! Hehe.

21 Oct 2007

Proto-Uralic flora

And now a tasty nibble from a completely different dish. I found another snippet from Google Book Search, this time from The Uralic Language Family: Facts, Myths and Statistics (2002) by Marcantonio, p.181 which reconstructs the following trees and explains the reasons why this proves that Proto-Uralic was spoken at an early prehistoric date in the environs of the Ural mountains. Words reconstructed in the book are *kuse ~ *kose 'spruce, fir', *soksɜ 'Siberian pine', *ńulkɜ 'white fir, Siberian fir, silver fir' and *näŋɜ 'Siberian larch'. (Note that Juha Janhunen would reconstruct these a little differently based on his intriguing comparisons with Samoyedic and quite frankly I prefer his system. Click here for page 95 of the same book.)

Liber Linteus and religious formulae, part 3

(Cont'd from Liber Linteus and religious formulae, part 2)

The last thing that I want to talk about here is what might be behind the seemingly random interchange between the aforementioned patterns A, B and C of this particular religious formula.

In itself, there is no rational reason why the same three words of this list should be declined for grammatical case in three very different ways, even if they happen to convey the same overall meaning. Of course, there is always the possibility that it's just artistic whim on the part of the scribe, but I try to avoid vague ideas like this when I can find a stronger pattern. What I notice is that this religious formula is probably not to be understood in isolation, but rather is part of whatever phrase that precedes it. On its own, this religious formula seems to lack a finite verb form (which I believe is any form that is marked for tense). Here we only have enaś at the end of these phrases which I interpret to be a deverbal adjective meaning "everlasting", from the verb en "to last, endure".

To make a long story short, certain verbs go with certain patterns. Again, looking at the patterns and taking note of the words that precede it, it appears we have another interesting and larger pattern at work:

| pattern | preceding phrase with formula | verb used with pattern | nominal inflection used in formula |
|---|---|---|---|
| A | Cis-um pute tul θansur haθr-θi repin-θi-c śacni-cle-ri cilθl, śpure-ri, meθlume-ri-c enaś. | pute | locative -e + -ri |
| B | In ze-c fler θezince śacni-cś-treś cilθś, spureś-treś meθlume-ri-c enaś. | θezince | directive -iś + -treś |
| C | Cis-um θesane uslane-c mlaχe luri zeri-c ze-c aθeliś śacni-cla cilθl, śpural meθlumeś-c enaś. | mlaχe | attributive -s/-l |

It seems to me that the religious formula here, by showing the same indirect objects with different case markings for these different verbs that precede them, may be a key to an intelligent translation of these passages. Some verbs may not allow certain case endings simply because the semantics of the verb disallow it. And that, my friends, is all I have to say on that for now.

19 Oct 2007

Liber Linteus and religious formulae, part 2

(Cont'd from Liber Linteus and religious formulae, part 1)

The great irony is that in regards to the Etruscan language, we depend far too greatly on specialists from areas other than linguistics to give us a competent account of its grammar. What honestly does a historian, archaeologist or museum curator know about structural linguistics unless they are willing to devote themselves to this study directly? This is precisely why little of substance has been published on the language for decades and ludicrous claims continue to fester.

There are numerous examples of false claims so ad hoc that they were provably false before they ever made it to the printed page. Despite what has been published, it's plain to see after a little deduction that -a or cannot be "imperatives" because this is only based on ad hoc comparisons with imperatives in Indo-European languages of similar appearance; there is no such thing as a "definite accusative" *-ni (an utterly desperate analysis of spureni based on ad hoc association with pronominal 1ps oblique mini "me") [1]; un is not the 2ps pronoun because both un and unχva, the inanimate plural, are attested in the same document (i.e. Liber Linteus); anan is not the 3pp pronoun "they" because of its inanimate genitive plural anancveś, and the absurd list of claims goes on and on and on. If we should all be completely confused about how the Etruscan language works, it's a by-product of a hundred years of shameful nonsense by irresponsible academics who have been more concerned with the number of pages in their books than the number of credible claims. It's because of this irritating confusion that I started this project in the first place. I was tired of my mind being held hostage by other people's unmethodological claims.

So to really understand a document like the Liber Linteus, we need to start down our own lonely path. Much of the Etruscan grammar has been arrived at through random look-alike comparisons to Indo-European at a time when it was believed that Etruscan was of that family and linguistics was in its infancy. We have to do a lot of slashing before we can get to the kernel of truth.

I want to freely explore a new explanation that might clarify why we have both śpural (a type-II genitive) together with śpureś-treś (which seems like a type-I genitive) without rhyme or reason. The key perhaps is in rejecting altogether the Indo-European-motivated term "genitive" and in seeking a more descriptive term that is consistent with its actual usage. Even Giuliano and Larissa Bonfante have admitted that the so-called "genitive" (often having a meaning of "of" or "from") is also used as a "dative" (i.e. a meaning of "for" or "to") [2]. Unfortunately, these specialists are too confused themselves to be of much help since they also claim -si to be a "dative" in the same book, albeit with unsettling question marks beside their creative analyses [3]. Let's reanalyse the Etruscan declensional system for nouns as follows:

nomino-accusative: (unmarked)
attributive: -s/-l
directive: -is
locative: -e
By renaming the genitive case as "attributive", we make it clearer that these endings are not just restricted to mere "possessives" or "ablatives" but rather we recognize their many other usages. The case serves the more general purpose of marking attribution, whether it be signalling ownership or reception (as through ditransitive verbs like tur "to give"). With inherently inanimate nouns like "stone" or "tree", neither possession nor the act of receiving is logically possible so in these cases, it's more linguistically natural that it indicates association. By also recognizing a case undifferentiated by gender which is distinct from but deceptively similar to the type-I genitive, we then explain away the coexistence of śpural and śpureś-treś. The former is attributive (with benefactive nuance here), and the latter is directive (indicating motion towards) with the addition of a postposition -tra which itself is declined. So despite the form śpureś, we may continue to understand spur "city" as a type-II noun using -l for "genitive".

We might now notice that the three variants of the aforementioned religious formula seen in the Liber Linteus document, despite superficial differences in declension, are likely to revolve around the same, general semantics:

1 (sacni) | 2 (cilθ) | 3 (spur) | 4 (meθlum) | 5 (en)

Ignoring for the moment any controversies regarding a proper translation of the phrase śacni cilθl, I would like to suggest that pattern A effectively translates as "for the [śacni cilθl], for the city and for the people everlasting" (locative -e + -ri = purposive/benefactive), pattern B means "to the [śacni cilθl] (and) to the city everlasting" (directive -is + -tra-is involves an action directed towards these things) and pattern C states "to the [śacni cilθl], to the city and to the people everlasting" (attributive -s/-l, implying an act of offering or dedication). What I suspect is that the three case patterns are united by a particular semantic overlap that in general conveys to us that these three things are the subject of ritual devotion. I would also gather that the alternation between cilθl and cilθś is due to the coordinated marking of both elements in the first noun phrase of pattern B with the directive case -is. I wonder if this "coordinative marking" or "double marking" occurs particularly in cases where two nouns in a phrase serve to convey something that as a whole is distinct from its individual elements (e.g. English "boob tube" conveys neither a "boob" nor a "tube"). But for now, just consider that last idea an idle, untested thought.

Now if this new analysis of Etruscan declension is kosher, then why are these three patterns used here? Is it artistic whim or is there something more afoot?

(Continue reading Liber Linteus and religious formulae, part 3)

[1] G. & L. Bonfante, The Etruscan Language: An Introduction (2002), revised edition, p.83 (click here). Here an unclear passage spureni lucairce is excised from its original context. You may remember this phrase from my previous post entitled Etruscan 'lucairce': How good is your eyesight? where I explained that the artifact is far too damaged to make certain of anything. One would think that a responsible academic would never use damaged text such as this as a direct example of anything, but this didn't stop the Bonfantes.
[2] G. & L. Bonfante, The Etruscan Language: An Introduction (2002), revised edition, p.82 (click here): "The genitive frequently has the function that we attribute to the dative: e.g. 'Venel Atelinas gave this to the sons of Jupiter' [...]".
[3] G. & L. Bonfante, The Etruscan Language: An Introduction (2002), revised edition, p.84 (click here). Notice that -si is claimed to be "dative of agent" and yet the absurdity of this analysis becomes quickly apparent in TLE 84 (Larθiale Hulχniesi Marcesi-c Caliaθesi munsle) where Larth and Marce cannot possibly be "agents" but are *deceased* recipients! Yet another hilarious Etruscologist goof-up that should never have been published. I got a million of 'em.

18 Oct 2007

My draft Etruscan glossary now downloadable on esnips as well

15 Oct 2007

Etruscan Dictionary Draft 003 now available

It's that time of month again! That's right! It's time for Draft 003 of my ongoing Etruscan language project, free for everyone to download on Lulu's website. The recent update has 912 entries. Yippee! I hit the 900 mark! For now, I'm making these revisions every 15th of the month, just to keep everything up-to-date.

Download Etruscan Dictionary Draft 003 here for free

For those of you that have been following along, I have made several changes and you can read my explanations for my changes by following this link or you can click on the "Draft 002" link in the blue bar above for reference in the future. I can foresee a few more drafts as I learn more and discover new intricacies about the language, which I recently have... but don't worry I'll blog about this very soon.

In fact, I still have a rant about religious formulae in the Liber Linteus. I haven't forgotten. It and the Tabula Capuana have been weighing heavily on my mind lately.

So, at any rate, enjoy the pdf, everyone!

13 Oct 2007

Liber Linteus and religious formulae, part 1

The Liber Linteus Zagrabiensis is a long religious document written on linen sometime around the 2nd century BCE (see Rix, Etruskische Texte, page 1, for its transcription). It was discovered in the 19th century by a mummy collector, of all things, who had managed to acquire one in Egypt with curious wrappings. It turns out after a long period of ignorance that the strips contained Etruscan on them. So the story goes, it was probably a genuine Etruscan book that made its way to Egypt somehow to be used as wrapping for a mummy. As suspect as the history of this Zagreb mummy is, let's not assume even more hoaxes and controversies in Etruscan studies than there already are. Suffice it to say, this document has never been translated to any appreciable degree despite it being a potential wealth of information.

The thing that strikes me as the most interesting puzzle to solve is one of a few formulaic phrases that gets repeated in different case forms. The pertinent phrases are as follows:

1 (sacni) | 2 (cilθ) | 3 (spur) | 4 (meθlum) | 5 (en)

We can readily observe that elements 1, 3 and 4 are nouns that agree with each other on case endings. The final noun is given the conjunctive -c meaning "and", so we can be certain that this is a list of nouns. Element 5 is a form of the verb en (encountered in the form eniaca in the Pyrgi Tablets). It ends in an aspectual marker -aś. It is an infinitive (not marked with presentive -a or preterite -e) and probably functions more like an adjective, either modifying the preceding noun or modifying the entire list of nouns. We may also note that elements 1 and 2 belong to a single unit with śacni being the head noun.

Now that we have that straightened out, what do these case suffixes mean? In instance A, the case ending of choice is the locative in -e extended with the postposition -ri which is believed to be purposive, meaning "for"[1]. In instance B, another postposition is used, namely -treś. Its meaning is unfortunately unknown but some have labeled it an enclitic demonstrative without substance to back it up. Element 4 is missing in this instance. Finally, in instance C, it appears that the genitive case is being employed and depending on the gender of the noun, either the s-genitive or the l-genitive is used. Note also that throughout all of this, element 2 is marked in the genitive. Curiously however, it alternates between both s- and l-genitives. Why?

I will discuss more on this. Stay tuned.

(Continue reading Liber Linteus and religious formulae, part 2)

[1] See Pallottino, The Etruscans (1975). On page 215, he suggests that śpureri is "probably 'to [or for] the city'". I would assert that "for" is the most precise value for -ri, that is, a postposition specifically identifying someone or something that benefits from a specified action.

11 Oct 2007

Reinterpreting the Proto-Indo-European velar series

I've pondered for years now about Indo-European (IE) phonology and the problems associated with it. It annoys me that IEists don't finally address them by updating their transcription system. One of these major issues with the sound system involves the so-called velar stops, that is, the reconstructed sounds *ḱ, *k, *ǵ, *g, *ǵʰ and *gʰ.

You may wonder what's wrong with them. After all, satem dialects show us clearly a distinction between *ḱ and *k. Of course, we should have no issue with this contrast at all. However the problem lies with how certain we are that *ḱ is phonetically realized as a palatalized consonant in IE. The nitty gritty of it is that we only assume that *ḱ is palatal since this is how it ends up in satem dialects where we find the palatal affricate in its place. So it's long been believed that since IE *eḱwos becomes Early Indo-Iranian *ećwos then it follows that IE itself had palatal consonants. Oh dear, what a careless leap of logic! The evidence from satem dialects merely shows conclusively that early Post-IE dialects had palatal consonants. Centum dialects lacked palatalization altogether. But how can we explain IE without palatalized sounds?

To explain IE without palatalized stops we need to first explain what we plan to do to fix IE. This is how we should update IE's sound inventory for the 21st century:

  • Palatalized stops are to be reinterpreted as plain stops:
    *ḱ, *ǵ, *ǵʰ -> *k, *g, *gʰ
  • Plain stops are to be reinterpreted as uvular stops:
    *k, *g, *gʰ -> *q, *ɢ, *ɢʰ
Why that's blasphemy! Delicious, isn't it? Now get ready. Here comes the reasoning behind it.

We know that pronouns and numerals contain the so-called palatalized stops exclusively and yet this is completely counter to the principle of phonological markedness. We expect simpler sounds to be used for such common words and yet clearly IE is a theoretical maverick: *eǵoh₂ 'I', *ḱo- 'this', *sweḱs 'six', *déḱm 'ten'. This in itself is clear proof that these sounds must be interpreted as plain, not marked with added palatalization.

There is also the consideration that the sequences *ke or *ek are rarely reconstructed for IE. By acknowledging that traditionally transcribed *k is in fact marked and pronounced further back in the mouth, and further by pairing it with *h₂, we realize that the reason for the lack of these sequences is because *q, *ɢ and *ɢʰ colour vowels just like *h₂ does. So whenever we see *ka or *ak reconstructed, we should remember that they are in fact *qe and *eq (e.g. *kap- 'to seize' is to be understood as *qep-). I will go out on a limb and bet that the few words that are reconstructed with these sequences of *ke or *ek are falsely reconstructed, either because they are based on false evidence, because the proof points rather to its "palatal" counterpart, or because the vowel in question should be long (n.b. that long vowels resist colouring normally caused by neighbouring *h₂).
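The retranscription proposed above is mechanical enough to sketch as a small Python function. This is only an illustration under stated assumptions: the function name `retranscribe` and the `undo_colouring` switch are my own labels, the input is assumed to be in the traditional notation, and the vowel rule blindly rewrites every *a next to a uvular as *e, which glosses over long vowels and any genuine exceptions.

```python
import re
import unicodedata

def nfc(text):
    """Normalize to NFC so that accented letters compare reliably."""
    return unicodedata.normalize("NFC", text)

# Traditional symbol -> proposed symbol (the two substitutions listed above).
TRADITIONAL_TO_PROPOSED = {nfc(old): new for old, new in [
    ("ǵʰ", "gʰ"), ("gʰ", "ɢʰ"), ("ḱ", "k"), ("ǵ", "g"), ("k", "q"), ("g", "ɢ"),
]}

# Longest symbols first, so "gʰ" is matched before a bare "g".
_STOPS = re.compile("|".join(
    sorted(map(re.escape, TRADITIONAL_TO_PROPOSED), key=len, reverse=True)))

def retranscribe(form, undo_colouring=True):
    """Rewrite a traditionally transcribed form in the proposed notation."""
    # One simultaneous pass: a *k produced from *ḱ is never re-read as *q.
    form = _STOPS.sub(lambda m: TRADITIONAL_TO_PROPOSED[m.group(0)], nfc(form))
    if undo_colouring:
        # Treat *a beside a uvular as a coloured *e (a simplifying assumption).
        form = re.sub(r"(q|ɢʰ?)a", r"\1e", form)
        form = re.sub(r"a(q|ɢ)", r"e\1", form)
    return form

for word in ["*kap-", "*sweḱs", "*ḱo-"]:
    print(word, "->", retranscribe(word))
```

Doing the substitution in a single regex pass matters: applying the two rules one after the other would wrongly push the former palatals all the way back to uvulars.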

When we start pondering the effects of this reinterpretation, we begin to see a different story concerning the development of satem dialects unfold. We then realize that the Satem dialect area was the innovator, pushing the two stops *k and *q frontward in the mouth. Hence briefly, dialectal *ḱ and *k spread across a portion of the IE-speaking area where the rest of the dialects kept original *k and *q. This regional isogloss was now the seed for satem dialects like Indo-Iranian and Balto-Slavic. However, palatal velar stops are unstable and quickly turn to affricates, so it wouldn't have been long before *ć and *k were heard throughout Satem IE, as became the norm in later Indo-Iranian.

9 Oct 2007

"Mid Indo-European", Semitic and Neolithic numerals

Maybe I'm obsessive but this whole thing about bad Nostratic reconstructions and ancient numerals deserves more discussion. Lots more. I have a love-hate relationship with the Nostratic theory. On the one hand, I'm convinced by its basic premise of certain language groups being related together in the past 15,000 years, and yet I'm also irritated by the results arrived at by people who don't seem to take enough time to work out the details. I especially appreciate some of Allan Bomhard's contributions to Nostratic and yet I'm also left wanting for something more in-depth from him. I want to know exactly what happened in the past without it being doctored up with wishful, half-thought-out thinking and I don't believe for a second that we know all there is to know.

Regarding Bomhard's general reconstruction of Proto-Nostratic, I believe that many of these "Nostratic" roots are not genuine. However, I also think that some of these listed items may rather be potential evidence of loanwords adopted from Proto-Semitic (PSem) into a stage of Pre-Indo-European (Pre-IE). To me, the example of IE *septm̥ from PSem *sabʕatum is the clearest and most undeniable case of Pre-IE borrowing, which is why I must sound like a broken record when I repeat it so often. So now let's get serious and propose something more realistic than 15,000-year-old numerals.

I suggest that the likeliest time for such an adoption of borrowings is the height of the neolithic around 6000-5500 BCE when trade is known to have flourished across Eastern Europe and the Near East. The neolithic was not just about a wide network of traded goods, but a newly expanded exchange of ideas and a greater sharing of common religious beliefs across larger spans of geography. This I believe would be the main reason behind the spread of "7" from Proto-Semitic into Pre-IE and other languages. Marija Gimbutas wrote about the neolithic period, although in my view she sometimes got too corrupted by feminist revisionism to be taken seriously. For instance, it's too simplistic to say that Indo-European speakers were all patriarchal warriors and native Europeans were all matrifocal pacifists. This sensationalism sells lots of books but the study of ethnology is far more complicated than this modern idealism.

During the neolithic, I envision a network of various groups speaking a number of Pre-IE dialects over a large territory surrounded by some "Para-Pre-IE" dialects (i.e. indirect "cousins" of IE that were later taken over by expanding IE) and non-IE languages. Pre-IE speakers would also have had a number of different traditions, belief systems and genetic origins dependent on the region one is speaking of. The core of Pre-IE would have been the areas west of the steppes. I separate the Pre-IE stages of Indo-European arbitrarily into three sections to keep things tidy in my head:

Old Indo-European (OIE) - 7000-6000 BCE
Mid Indo-European (MIE) - 6000-5000 BCE
Late Indo-European (LIE) - 5000-4000 BCE

I use Proto-Indo-European (PIE) to refer to the very last state of the language before it began to fragment into dialects like Proto-Anatolian. I believe that it was MIE that first adopted Semitic vocabulary, including a few numerals. In this chronology, MIE is the Pre-IE stage immediately before the event of Syncope (i.e. the point at which unstressed vowels were dropped or reduced in all positions, causing clustering and important changes to IE phonotactics). I place Syncope at the beginning of the Late Indo-European period, circa 5000 BCE. Before Syncope, MIE had far fewer clusters than LIE and had a predictable accent fixed on the penultimate or antepenultimate syllable. Now on that note, I would like to shamelessly put up for discussion the following theory that I've been developing for years:
word       MIE (PIE)              Semitic
'three'    *tareisa (*treis)      *θalāθu
'six'      *sʷeksa (*sweḱs)       *šidθu[1]
'seven'    *septam (*septm̥)       *sabʕatum

As we can see, the pre-Syncope vowels are necessary to fully understand what has happened. Having only *e and *a to fill syllables in MIE, final *-a was pronounced as schwa, thereby mimicking the Semitic nominative in *-u. The PSem stress accent was probably also predictable, being placed on the last non-final "heavy syllable" (a syllable that was either closed (CVC) or contained a long vowel), or failing this, on the initial syllable. In "three", the long front vowel of PSem was naturally heard by MIE speakers as a diphthong *ei since this was the closest approximation in a language without long vowels. The Semitic dental fricative (*θ) was normally interpreted as initial *t- or medial *-s- in MIE (shown in both "3" and "6"), both of which are again natural approximations in a language that lacks this sound (n.b. consider how many French speakers pronounce voiced /ð/ in "that" as /z/ or /d/ instead).
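For the programmers among us, the Proto-Semitic accent rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the hand-made syllable divisions and the tiny vowel inventory are invented here for the example and are not drawn from Gray or anyone else.

```python
# Toy model of the PSem accent rule described above.
# ASSUMPTIONS (hypothetical, for illustration only): words arrive already
# split into syllables, and the vowel inventory is just a/i/u plus long ā/ī/ū.
LONG_VOWELS = set("āīū")
VOWELS = set("aiu") | LONG_VOWELS


def is_heavy(syllable):
    """A syllable is heavy if it is closed (ends in a consonant)
    or contains a long vowel."""
    closed = syllable[-1] not in VOWELS
    has_long_vowel = any(ch in LONG_VOWELS for ch in syllable)
    return closed or has_long_vowel


def stressed_index(syllables):
    """Return the index of the accented syllable: the last non-final
    heavy syllable, or the initial syllable if there is none."""
    # Walk leftwards from the second-to-last syllable.
    for i in range(len(syllables) - 2, -1, -1):
        if is_heavy(syllables[i]):
            return i
    return 0


# Syllable divisions below are my own guesses for the three numerals.
for word in (["θa", "lā", "θu"], ["šid", "θu"], ["sab", "ʕa", "tum"]):
    print("".join(word), "-> accent on", word[stressed_index(word)])
```

On these assumptions the accent falls on the long vowel of *θalāθu and on the first syllable of both *šidθu and *sabʕatum.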

The word "six" needs further explanation because it has confused many linguists as to why it should be that the Semitic cluster *-dθ- ended up as *-ḱs- in PIE[2]. First of all, we should notice that PSem *d is not the same phoneme as PIE *d. The important difference is that PSem *d was alveolar (as in English) while it was dental in PIE (as in French). This means that the Semitic sound as well as the following dental fricative were pronounced further back in the mouth than IE speakers were used to. In its place then, a velar stop would be an understandable replacement for the dental stop here and coincidentally, the *s in *-ks- would have necessarily been alveolar next to a velar stop since it's nearly impossible to pronounce a dental *s immediately after retracting the tongue. So now we can see why this was an optimal solution for IE speakers and furthermore there are many borrowings in other languages where stops are switched like this (note the history of the name Carthage). There is also the fact that from the perspective of markedness, to make a long story short, PIE *ḱ must logically be reinterpreted as a plain velar stop *k (not palatalized) while *k must have been a uvular or pharyngeal *q. However, until the traditional notation is abolished, the topic of velar stops in IE will remain confusing and misunderstood.

I've probably raised more questions than answers with this topic of Mid Indo-European but hopefully this will inspire more discussion on the topic of Pre-IE because I think this untouched aspect of Indo-European linguistics is full of interesting possibilities. My roughly hewn theory may not be perfect but I think this is a better answer to the problem than Bomhard's implausible Nostratic roots, *sʷakʰsʷ- "six" and *sab- "seven" [3].

[1] Semitic reconstructions are from Gray, Introduction to Semitic Comparative Linguistics (1934). p.70. As an interesting aside, one may appreciate Klimov's Etymological Dictionary of the Kartvelian Languages for a run-down on Proto-Kartvelian *šwid- "seven" which is derived from a Semitic masculine, non-mimated form of the numeral, *sabʕatu (Akk. šibit).
[2] Page 106 of Martin Bernal's Black Athena: The Afroasiatic Roots of Classical Civilization is a perfect example of how many authors confuse rather than inform us on the topic by simply offering a raw dump of completely conflicting ideas that fail to answer to any appreciable degree how the words might or might not be plausibly related.
[3] Bomhard/Kerns, The Nostratic Macrofamily: A Study in Distant Linguistic Relationship (1994).

1. (Oct 9/07) Please note that while some may feel that the realization of PSem *š as MIE *sʷ is strange, there is precedent in English, French and Italian pronunciations of the sh-sound /ʃ/ as /ʃʷ/. See Ball/Müller, Phonetics for Communication Disorders in Chapter 14: English fricatives. Based on this, we may surmise that PSem *š was similarly pronounced as /ʃʷ/.

8 Oct 2007

Thoughts on Nostratic, Semitic 'seven' & neolithic trade

Numbers in protolanguages are one of those things that intrigue me for some reason, and there are tonnes of things to talk about on this subject alone, particularly when discussing prehistory. Allan Bomhard reconstructed a Nostratic root *sab- for "seven" (see Bomhard/Kerns, The Nostratic Macrofamily: A Study in Distant Linguistic Relationship, 1994. p.361) which, I believe I've stated before, I don't buy for a second.

As much as I think it's worthwhile to reconstruct protolanguages and even to think seriously about long-range comparative linguistics, I'm not in favour of loose and ill-informed etymologies. This is one of those badly reconstructed items that needs to be undone since it's a fact that Proto-Semitic (PSem) *sabʕatum is reconstructed as a valid, masculine form of the numeral "seven" and it's also a fact that Proto-Indo-European (PIE) "seven" is *septm̥. It's self-evident that Indo-European borrowed this word from Proto-Semitic peoples sometime before 4000 BCE since the word in Indo-European is unanalysable while in Proto-Semitic, the word is known to be formed from the numeric root *sabʕ-. This speaks volumes about a widespread neolithic trade between speakers of both Indo-European and Semitic languages, not to mention a network of other protolanguages.

However, little else is said about this and IEists seem too busy with other things to explore how the Pre-Indo-European dialects sat within the context of prehistoric society and the budding Mediterranean economy of the 7th and 6th millennia BCE. I seem to be one of the few even mentioning any of this at all. And I can't fathom why because it's a fascinating story.

Well, it turns out that our Jewish siblings at Forward are also aware of this fun linguistic factoid. In December 2005 they offered a funny article, "Is British Ish Brit?", about the false Hebrew etymology for the word "British". After explaining how real linguistics works and renouncing these eyeball etymologies, they then mention:

"Yet even the invention of wine came at a relatively late stage of human development. There are some Hebrew-English word connections that may go back further. Take, for example, Hebrew sheva, “seven”: While its resemblance to the English numeral might appear a pure coincidence, there are reputable linguists who think that it isn’t and that sheva and seven’s oldest ancestors, conjectured Proto-Semitic sab`atum and conjectured Proto-Indo-European septm, may owe their similarity to contact as much as 10,000 years ago between early Indo-European and Semitic speakers in or near the eastern Mediterranean."
Yep, and it's very correct. For those not caught up on the subject of neolithic languages, you may be forgiven if your mind strays and starts envisioning a crazy dialogue between Indo-European and Semitic traders:

Indo-European guy: So tell me, Ariel, do you have the bottles of wine my village requested?
Semitic guy: Yes, Hans, right on schedule. Straight from a village in Turkey. But oy! Was it ever a pain in the tokhes sailing down the Danube today. The winds were blowing and my sail almost broke off. Took me almost sab`atum days to get here just from one of your local villages upstream! But listen to me, don't I sound like a pitshetsh. Forgive me, how was your day?
Indo-European guy: Say, what was that you said? Septm, what does that mean?
Semitic guy: Oh, I said, sab`atum, you know, the numeral after... how you say... sweks?
Indo-European guy: Ah yes! Hahaha. You know, Ariel, your word is so much better than our word. I'm tired of saying "five plus two" all the time, so from henceforth I shall say septm like you.
Semitic guy: Haha, wonderful, however you're pronouncing it all wrong. What's with you crazy Indo-Europeans and your inability to pronounce a simple pharyngeal?
Indo-European guy: Haha, let's talk more over a cup of medhu.
Semitic guy: Say what now?
Putting jokes aside, Indo-European speakers were undoubtedly a varied lot of people, differing from region to region in appearance, customs and dialect. Same too for Proto-Semitic speakers. Somewhere in the middle they met.

And now to offer my theoretical explanation of events that transpired. My personal suspicion is that a fair amount of Proto-Semitic (PSem) vocabulary was adapted into Pre-Indo-European sometime between 6000 and 5500 BCE because of a network of neolithic trade going on. The Semitic numeral for "seven" would have had stress accent on the first syllable: *sábʕatum. It was then adopted into Pre-Indo-European (Pre-IE) as *séptam. You see, being that Pre-IE speakers didn't have pharyngeals in their language, the sequence -bʕ- just sounded like a strange-sounding "p". The second "a", being unstressed in PSem, was probably pronounced as a short schwa and thus explicably inaudible to some Indo-European ears. PSem *a would have been a front vowel like the "a" in "cat" in order to explain its change to *e in Pre-IE.

Over time, the stress accent in Pre-IE eroded unstressed syllables, producing *séptm̥. This later changed to *septm̥ when the accent shifted to the last syllable to match the accent pattern of the neighbouring numeral *oḱtō "eight". This explains why a reduced syllable containing only a syllabic nasal came to have accent. But then why did they borrow the numeral, and why was this same Semitic numeral also borrowed into other protolanguages like Proto-Kartvelian[1]? I think this was simply a matter of religious connections that are lost to us. Later Babylonians associated certain numbers with deities and I suspect that a precedent existed in the neolithic, revolving particularly around the numerals "six" and "seven". These numbers must have symbolized someone or something that was widely thought to be important enough to extend beyond the many cultures and languages of Eastern Europe and the Near East. So this is why I think that the significance of these numerals was a matter of early religious beliefs.
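The erosion step can be mimicked with a toy Python function. Take it as a sketch under loudly stated assumptions rather than a real model: the stressed vowel is written in uppercase, only *a and *e count as vowels, every unstressed vowel is simply deleted, and a word-final nasal stranded after a consonant is marked syllabic. The name `syncope` and the input convention are invented for illustration, and the later shift of accent to the final syllable is not modelled.

```python
# Toy model of Syncope: unstressed vowels drop, stressed ones survive.
# ASSUMPTIONS (hypothetical): the stressed vowel is typed in UPPERCASE,
# and only lowercase 'a' and 'e' are treated as deletable vowels.
UNSTRESSED_VOWELS = set("ae")
SYLLABLE_NUCLEI = set("aeiu")   # anything here can carry a syllable
NASALS = set("mn")
SYLLABIC_MARK = "\u0325"        # combining ring below, as in m̥


def syncope(word):
    """Delete every unstressed vowel, then mark a stranded final nasal
    as syllabic if it no longer has a vowel or glide before it."""
    kept = "".join(ch for ch in word if ch not in UNSTRESSED_VOWELS)
    result = kept.lower()  # drop the uppercase stress convention
    if (len(result) > 1
            and result[-1] in NASALS
            and result[-2] not in SYLLABLE_NUCLEI):
        result += SYLLABIC_MARK
    return result


print(syncope("sEptam"))   # pre-Syncope *séptam
```

Fed the other pre-Syncope numerals *tareisa and *sʷeksa, typed as tarEisa and sʷEksa, the same function returns treis and sʷeks. The palatal notation of *sweḱs is deliberately left out of this toy.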

So when you consider the potential for many of these Indo-European words to have been borrowed during the neolithic from Proto-Semitic, it makes many of Bomhard's reconstructions questionable, particularly those based solely on data from Proto-Semitic and Proto-Indo-European. On the other hand, I wonder how many of Bomhard's "Nostratic cognates" could be reinterpreted as evidence for Semitic loans in IE. These are all things that make me go hmmmm...

[1] Proto-Kartvelian *šwid- "seven". According to Klimov (Etymological Dictionary of the Kartvelian Languages, page 251), "a closer relationship of Akk. šibit 'seven' is quite evident".

5 Oct 2007

Etruscan nesl, TLE 515 and other random Etruscan stuff

I've been lying low lately, listening to Björk and 80s music as I use my spare time sniffing for more info on hundreds of Etruscan inscriptions for myself. (And hopefully one day, I'll find clear, colour photos of some of these very well hidden artifacts!) The Etruscan language and how it's been analysed so far shows one huge problem after another so I'm just going to rant about my journeys thus far with the Etruscan language.

So I was doing a search for TLE 515 the other day. In that inscription, there's a phrase tular hilar neśl inscribed on a cippus which got me thinking more on what neśl and related items really mean. Being firmly skeptical of what Etruscologists have been claiming so far, given all their other contradictions floating about, I presumed for myself that it was a praenomen based on TLE 572 (ca śuθi neśl amcie titial [...]). After doing a little audit on this word however, I found it inscribed on the Lead of Magliano (TLE 359) as well: ... hevn avil neśl man murinaśie ... So I must adapt my hypothesis. It must be an ordinary word rather than a name, because it's unlikely for a praenomen to be dropped into a phrase in that context.

Larissa Bonfante merely claims exactly what several academics before her have claimed for a full century now, that neśl refers to a dead person. She also publishes a nebulous translation of nesna as "belonging to the dead" with a question mark trailing after it (Bonfante, Reading the Past - Etruscans, 1990. p.61 in the section Appendix 2 - Glossary of Etruscan Words). In the context of the original inscription with nesna in it (TLE 372: Θestia Velθurnas nesna), if we are to pursue her avenue of reasoning, it should then be better translated as "sepulcher" (hence "Thestia Velthurna's sepulcher") given its archaeological context. It hardly means "Thestia Velthurna belonging to the dead" after all, so it's reasonable to feel uneasy about Bonfante's superficial understanding of the language.

I have to admit then that it might seem okay to associate a root neś- with all things dead. However, it doesn't make me feel entirely secure when I see books from as early as 1883 claiming the same thing verbatim, along with unfortunately added details about its unlikely connection with the Greek root for death, nek- (i.e. as in words like nekus and nekros, see Etruskische Forschungen und Studien, ed. Deecke & Pauli, 1883. p.235). Have we not progressed at all since then? Apparently not. When I see that, I start getting Greenberg-itis, the kind of rash one gets when reading claims of translations built on whimsy and on "eyeballing" for subjective similarities. It has been understood for some time now that Etruscan is not Indo-European at all. Oh do I hate privileged people with doctorates abusing their certificates on mass comparison nonsense like this. It sends me round the bend.

It's even more curious a translation when we consider the previously mentioned phrase ca śuθi neśl which then would mean "this grave of the dead person" according to most experts. The question that pops up in my mind is what other sort of grave is there? Have you ever heard tell of a "grave of the living"? Putting jokes aside, there must be something lost in translation here but I'm unsure yet what would make better sense.

Then, let's see, what else do I have to rant about? Oh yes... I have to modify the entry of the verb θes- 'to dawn' to an intransitive verb. Another silly booboo. As I was changing it though, I realized that its participle form then must be *θesθ. The significance of that is in relation to the attested word θesθu which occurs in the sentence of TLE 329: Aχlei Truies θesθu farce. I translate it as "eastward" (i.e. "to the direction of dawn"). What I find interesting is that the word is then possibly built on this hypothetical participle *θesθ. If so, I wonder if hinθiu "below" isn't built up in similar fashion, from *hinθ "below" (found declined in locative cases as hinθa and hinθθin) which in turn may be from the participle of an intransitive verb *hin "to be below".

I'll update the Draft 002 Ammendments page soon, but I have to sleep now. I have a haircut and a thanksgiving dinner to go to tomorrow.

4 Oct 2007

The Latin Verb Conjugator

There are some things on the internet that just make you wonder. For me, that "something" was recently a contribution by a person calling himself "John T. Wodder II". Last year, he offered cyberspace his programmatical invention which he calls a "Latin Verb Conjugator", written in Perl (a computer language used for programming, particularly for online purposes). As he describes it, "[...] this script will create a synopsis of a Latin verb in a specified person & number and will output the results as a LaTeX document." Wow, and I thought I was a propellerhead. Lol!

While most of you won't know what the hell to do with it, I mention it here because it makes me think, "Gee, the human brain, complex is. And so insufficient to explain it, the computer is." (My inner voice sometimes sounds a lot like Yoda.) Of course, I'm sure that the program can be compressed into something more versatile for other languages, but maybe Wodder doesn't yet appreciate the broad application of structural linguistics. Still, I dare anyone to actually think as concretely about Latin conjugation and take hours out of their day to make a functional program of their own. I admit I'm just too lazy.

2 Oct 2007

The Tower of Babel

Sergei Starostin, before sadly passing away from a heart attack at the early age of 52, headed the Tower of Babel Project, an online project whose name is in unfortunate reference to a Christian biblical myth about language origins as given in the book of Genesis. Ironically, much like the bible and its unquestioning literalists, many linguaphiles online are in the habit of quoting Starostin and holding his theories in great esteem while ignoring all the deep problems of logic that are immanent in them.

Now, before you believe in wiki-haste that I'm just a nagging oddball, please note that my view is on the side of academics who have large issues with Starostin's work. There is no conspiracy against Starostin and his followers. We don't have a hate-on for him as a person and I'm sure he had a jovial spirit by all accounts, but these critiques arise because his theories are simply not tenable and too far-flung to be of use to a disciplined linguist. If people feel too personal about linguistic critiques, it's their issue to deal with.
  • http://www.accessmylibrary.com/coms2/summary_0286-25704408_ITM
    "With fewer than 300 linguists in the world doing serious work on long-range comparisons, the discipline is small and perennially insecure about its scientific standards. Given the dearth of rigorous proof for some of Starostin's assertions, many American linguists felt within their rights to dismiss his research or at least to exclude him from their conferences and symposiums."
So let's see why this is the case from examples we can see within Starostin's own project. At his site, a myriad of proto-languages seem to be tackled all at once but it centers around his personal baby: "North Caucasian" (note that the name refers strictly to the languages of the northern regions of the Caucasus mountains and is not a racial term). Throughout his life he tried to force the two language families called Abkhaz-Adyghe and Nakh-Daghestanian into the same, proverbial, round hole. The view of actual Caucasian language specialists is that Abkhaz-Adyghe (AA) and Nakh-Daghestanian (ND) are entirely different in structure, phonology and word order. While AA tends towards simple three-vowel systems (some even claim two-vowels), ND is radically rich in vowels and consonants. The two families couldn't be any more different from each other. Plus, any similarities between the two could easily have been caused by strong areal influence between the two within such a small region and over a large expanse of time.

No matter. Starostin was unfazed and he set to work to create a hybrid language, more like a conlanger than a comparative linguist. He no doubt figured that his language would serve as a valid ancestor to both language families. Only one problem. It only works if you ignore rigorous methodology. Here are just a few flaws that I notice with North Caucasian that proponents just don't care enough about:

1. Violations of phonological markedness are everywhere
I think Starostin had a "diacritic addiction". For example, all pronouns like first person *zō violate markedness because plain *z is so rare in his convoluted phonology while palatal *ź is used to explain almost everything (see his list of 'z words' here) where his diacritic bias is self-evident. Many phonemic exotica are far too often employed, such as *ƛ̣, to make up for his shortcomings in taking the time to adequately demonstrate sound correspondences (e.g. the ambiguous reconstruction *Ł_ĕɫV̆ with link here). Through the use of contrived sounds as a smokescreen, void of a discernible phonological structure that all human languages have, he can freely connect different etyma together no matter how large of a leapfrog we have to hurdle to swallow it, making it seem to laymen as though North Caucasian has a lot more evidence behind it than it actually does.

2. The pronominal system is unnatural

Considering that his reconstructed pronouns curiously use only voiced consonants, an attentive linguist might consider the effects of sentence-internal lenition on grammatical elements and more sensibly reconstruct unvoiced *s for *z in the 1ps pronoun or *t instead of *d for the 2ps pronoun. One might also reduce the bloated phonemic inventory to a more manageable level that could then be finally accounted for systematically by demonstrable proof, instead of leaving it all hanging as an empty assertion.

3. Shoddy claims in more understood language groups of his database compromise his credibility.

Nothing could be more far-flung than Altaic **séjra "three" (link here) when Altaicists reconstruct *göl- "three" on more direct evidence (Mongolian gurav, Japanese kokono-). Despite the fact that it is agreed upon by specialists that Proto-Dravidian is reconstructed without voicing contrast in stops (read here), Starostin took an anarchist approach by representing Dravidian with them (link here).
The number of protolanguages in his database and the wide variety is impressive. The quantity of seeming information alone would cause many to believe that his work is exceptional. However, quantity is not quality. I think that the issue with Sergei Starostin was that, like many language lovers, he failed to see that linguistics is a science and not a form of artistic expression.