More on Papuan Languages

by Bill Foley, as written for the International Encyclopedia of Linguistics 2003.


Somewhere close to a quarter of the world's languages, about 1100 in all, are spoken in the New Guinea region. The New Guinea region as defined here includes the whole of the island of New Guinea and offshore islands like New Britain, New Ireland, Japen and Biak, as well as the adjoining areas of eastern Indonesia, especially the islands of Timor, Alor and Halmahera. Within the New Guinea region are found two large groupings of languages: Austronesian languages, which belong to the far-flung Austronesian language family stretching from Southeast Asia to Hawaii and number about 300 in the New Guinea region, and Non-Austronesian or Papuan languages, restricted to the New Guinea region and numbering some 800 languages.

Unlike the Austronesian languages, the 800 or so Papuan languages do not constitute a single, genetically unified language family (hence their common negative characterization, Non-Austronesian), but rather are organized into several dozen different language families at this point in our knowledge. With further careful comparative work some of these families will undoubtedly be combined into larger genetic groupings, as the Celtic, Germanic and Slavic families combined with others to form the Indo-European family, but at this point such claims are speculative; the careful reconstructive work still largely remains to be done. With such a high number of Papuan language families, it is not surprising that, on the whole, they are rather small. Compared to the gargantuan Austronesian language family, with over 600 members, the average Papuan language family has around 25 members. Further, the number of speakers of individual Papuan languages also tends to be small, averaging under 3000. Even the largest Papuan language, Enga, has only 200,000 speakers, and many Papuan languages have fewer than 100 speakers, some even fewer than 50.

A Papuan village.

The small size of many Papuan-speaking communities has often led to persistent multilingualism in the languages of adjoining communities, the development of trade jargons for interlanguage communication, or the shifting of language allegiance to the languages of more powerful or economically advantaged neighbours. In such a complex, fragmented linguistic situation, Papuan languages not unexpectedly exhibit a pattern of enormous cross-influence in all areas. All types of linguistic features (basic vocabulary, pronouns, grammatical patterns, discourse styles) can be and have been borrowed from one language into another. This makes the establishment of genetic links among Papuan languages doubly difficult: with no documentation older than 50 years for the vast majority of them, it is problematic indeed to sift what is truly genetically inherited material from what is borrowed from other languages, especially borrowings from genetically related contiguous languages or borrowings centuries old from now extinct languages. Consequently, comparative linguistics in Papuan languages must proceed with the greatest care and the utmost rigor.

It would appear that bound morphological forms are the most resistant to borrowing (though not entirely immune), so that bound morphological forms that appear cognate are the most reliable guide to genetic relationships among Papuan languages. The fact, for example, that a great swath of languages in New Guinea from the Huon Peninsula to the highlands of Irian Jaya mark the object of a transitive verb with a set of verbal prefixes, a first person singular in /n/ and a second person singular in a velar stop, is overwhelming evidence that these languages are all genetically related; the likelihood of such a system being borrowed is vanishingly small. Careful comparative work along these lines in Papuan languages is now in its infancy. We as yet know very little about these wider genetic relations, and establishing them is a major challenge for the next few decades.

Classification of Papuan Languages

With the above provisos in mind, let us survey the major Papuan language families. The largest family is the Trans New Guinea family, typified by the object prefixes mentioned above. This stretches across the island of New Guinea from the Huon Peninsula of Papua New Guinea in the east to the Paniai Lakes region of highlands Irian Jaya. The total number of speakers of languages in this family is in excess of 650,000 (or around 20% of the total Papuan-speaking population of around 3-4 million). Groups of languages belonging to this family include the Finisterre-Huon group (around 65 languages with over 130,000 speakers), the Eastern Highlands group, consisting of the four Kainantu languages (46,000 speakers) and the eight Gorokan languages (close to 200,000 speakers), and the Irian Highlands group, consisting of the six Dani languages (250,000 speakers) and the four Paniai Lakes languages (100,000 speakers). It is also highly likely that the large Madang family of over 80 languages (with some 80,000 speakers), spoken in the Papua New Guinea province of the same name, is part of this Trans New Guinea family.

There are a number of other language families which may also be part of this large Trans New Guinea family, but the relationship, if any, is less obvious than in the groups mentioned above and has yet to be demonstrated through careful comparative work. These include the Enga family, spoken in Enga Province and adjacent areas of Papua New Guinea, which includes Enga, the Papuan language with the largest number of speakers (over 200,000), and whose eight languages together have more than 400,000 speakers; the Chimbu family of some ten languages spoken in Chimbu and Western Highlands Provinces of Papua New Guinea, also with over 400,000 speakers; the Binandere family, spoken in Oro Province of Papua New Guinea (over 50,000 speakers of fourteen languages); the Angan language family of the Eastern Highlands, Morobe and Gulf Provinces of Papua New Guinea, with twelve languages and close to 70,000 speakers; the Ok family of the mountainous hub of New Guinea around the border of Irian Jaya and Papua New Guinea and spilling down into the southern adjacent lowlands, comprising fourteen languages and some 50,000 speakers; the Awyu family, situated to the south and west of the Ok family, with probably nine languages and 20,000 speakers; the Asmat family of the central south coast of Irian Jaya to the west of the Awyu family, probably forming a genetic grouping with both the Ok and Awyu families and containing four languages and over 50,000 speakers; the Mek family, spoken to the west of the Ok family in the mountainous interior of Irian Jaya, with about ten languages and 50,000 speakers; and some small language families of the tail or southeast section of Papua New Guinea like the Koiarian and Goilalan families, seven families in all, with thirty-six languages and 75,000 speakers.
If all of these language families are indeed ultimately demonstrated to be part of the far-flung Trans New Guinea family, then it will comprise close to three hundred languages and 2 million speakers, no less than 50% of the total Papuan-speaking population.
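As a rough check on these totals, the per-family figures quoted in the survey can simply be added up. The sketch below assumes that every "over", "close to" and "some" figure can be treated as an exact number and that the Madang family is counted with the core groups; the dictionary and variable names are illustrative only.

```python
# Approximate (languages, speakers) figures as quoted in the survey text.
# Hedged words such as "over" and "close to" are treated as exact values here.
core_groups = {
    "Finisterre-Huon": (65, 130_000),
    "Kainantu": (4, 46_000),
    "Gorokan": (8, 200_000),
    "Dani": (6, 250_000),
    "Paniai Lakes": (4, 100_000),
    "Madang": (80, 80_000),  # described as "highly likely" to belong
}
candidate_families = {
    "Enga": (8, 400_000),
    "Chimbu": (10, 400_000),
    "Binandere": (14, 50_000),
    "Angan": (12, 70_000),
    "Ok": (14, 50_000),
    "Awyu": (9, 20_000),
    "Asmat": (4, 50_000),
    "Mek": (10, 50_000),
    "Southeast families": (36, 75_000),
}


def totals(groups):
    """Return (total languages, total speakers) for a dict of groups."""
    languages = sum(langs for langs, _ in groups.values())
    speakers = sum(spk for _, spk in groups.values())
    return languages, speakers


core_langs, core_speakers = totals(core_groups)
cand_langs, cand_speakers = totals(candidate_families)
all_langs = core_langs + cand_langs
all_speakers = core_speakers + cand_speakers

# Share of the Papuan-speaking population, using the 3-4 million range.
share_if_4m = all_speakers / 4_000_000
share_if_3m = all_speakers / 3_000_000

print(f"{all_langs} languages, {all_speakers:,} speakers")
print(f"share of Papuan speakers: {share_if_4m:.0%} to {share_if_3m:.0%}")
```

Under these assumptions the sums come to 284 languages and just under 2 million speakers, which is in line with the rounded figures given in the text.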

The Sepik River, the largest river in PNG. Many small villages sit along its banks.

Large as this putative Trans New Guinea family is, and undetermined as its membership remains, there are still many Papuan language families which do not on present evidence show any indication of belonging to it. This seems especially true of the lowlands areas of the north coast of New Guinea, which tend to be extremely complex linguistically, with a number of distinct language families. In the Sepik-Ramu basin of the north coast of Papua New Guinea, the major groups are the Lower Sepik-Ramu family, spoken along the lower reaches of the Sepik and Ramu rivers and adjoining riverine and coastal regions, consisting of two sub-groups, the Lower Sepik and Lower Ramu families, which are typologically very different, containing six and nine languages respectively and some 15,000 speakers; the Middle Sepik family, found in the mid region of the Sepik River and adjoining areas, comprising the core Ndu family with seven languages and close to 100,000 speakers and some fifteen languages (30,000 speakers) upriver or north of it, and possibly the Sepik Hill group of the dense swampy rainforest country south of the river, consisting of fifteen languages and 7,000 speakers; the Torricelli family, spoken in the Torricelli Ranges between the north coast and the Sepik River, eastward through the hills north of the Lower Sepik region and into the coastal region of western Madang Province, comprising nearly fifty languages and 80,000 speakers and typologically highly divergent from other Papuan languages of the Sepik-Ramu basin or indeed the Trans New Guinea family; and the Sko family, spoken along the north coast of New Guinea, straddling the border between Irian Jaya and Papua New Guinea, consisting of eight languages and around 7,000 speakers.

Along the north coast of Irian Jaya and the Mamberamo basin the following major language families are found: the Sentani family, spoken immediately to the west of Jayapura, with three languages and about 10,000 speakers and possibly a further member of the Trans New Guinea family; the Lakes Plain family, a phonologically highly exotic family spoken in the flooded plains area of the basin of the Mamberamo, with some twenty languages and some 6000 speakers; the Cenderawasih Bay family, spoken on Yava Island in Cenderawasih Bay and the adjoining mainland, with five languages and some 12,000 speakers, and probably forming a large genetic grouping with the Lakes Plain family; the East Bird's Head family, spoken on the eastern side of the Vogelkop (Bird's Head) Peninsula in the far west of Irian Jaya, with four languages and 17,000 speakers; and finally, the Western Bird's Head family, found on the western side and central area of the Vogelkop Peninsula, with twenty languages and 45,000 speakers (this family has been proposed to be related to Papuan languages further west on the eastern Indonesian island of Halmahera, but this has yet to be demonstrated conclusively). There are many Papuan language families besides these, too many to list; the lowlands areas of New Guinea are particularly complex, with many small language families with relatively low numbers of speakers. This is especially notable in the West Sepik, Western and Gulf Provinces of Papua New Guinea and adjoining areas of Irian Jaya, which also contain a number of isolate languages, at this point unclassifiable into any larger language family. One notable group from these areas is the Marind family of the south coast of Irian Jaya close to the Papua New Guinea border, which contains six languages, with a total number of speakers in excess of 20,000.
Finally, one other important family not yet mentioned is the Bougainville family, found on the large Papua New Guinea island of the same name and comprising two not very closely related sub-groups, the North and South Bougainville families, which together contain eight languages with over 40,000 speakers. It is possibly related to other Papuan languages in the large islands of Papua New Guinea like New Britain and New Ireland and to some languages further south in the Solomon Islands, but this is yet to be demonstrated conclusively.

Structural Characteristics

With such a large number of languages and language families it is somewhat difficult to generalize conclusively about the structural characteristics of Papuan languages, but nonetheless some widespread properties can be identified. Phonologically, Papuan languages tend to be simple. The standard system of five phonemic vowels (/i/, /e/, /a/, /o/, /u/) is quite common, although other systems do exist. No Papuan language has yet been found to have more than ten vowel phonemes. Many Papuan languages, especially those of the Sepik-Ramu basin area, have unusual vowel systems, with a very high preponderance of central vowels. The consonantal systems also tend toward the simple. Usually, there are only three places of articulation for consonants, at the lips (bilabial), at the back of the teeth (dental) and the back of the roof of the mouth (velar), although some add a fourth, palatal (the high roof of the mouth). Most languages distinguish at least two types of consonants, an oral one (e.g. /p/) and a nasal (e.g. /m/) but in at least some languages of the Lakes Plain family, nasal consonants may be lacking entirely, an extremely rare occurrence among the languages of the world. Many languages further distinguish two types of stops, a voiceless one with no vibration of the vocal cords (e.g. /p/) from a voiced one with such vibration (e.g. /b/), but this is by no means universal and further, if present, the voiced stop is often preceded by a nasal onset (e.g. /mb/). Continuant sounds, like /f/, /s/ tend to be restricted in Papuan languages. Some languages lack them entirely; others only have /s/; while still others have a restricted set, /f/, /s/ and maybe one other. No Papuan language comes close to English with its eight continuant sounds. Also Papuan languages normally lack a distinction between /l/ and /r/; commonly one sound varies freely between these two articulations with no contrast in meaning. 
Finally, some Papuan languages make use of tones, the distinctive use of pitch to distinguish words, as in Chinese or the languages of Southeast Asia. This is sporadically found throughout New Guinea, in the Eastern Highlands family, the Sko family, the Lakes Plain family, etc. So, in Obokuitai of the Lakes Plain family, /ti/ with a high pitch means 'string bag', while with a falling pitch, it means 'a type of butterfly'; /di/ with a high pitch means 'red', while with a low pitch, it means 'you'. The form /ku/ is distinctive with all three pitches: with high pitch, it means 'cassowary', with low pitch, 'wood' and with falling pitch, 'a kind of soil'.


Villagers find a dead crocodile by the riverbank.

The structure of words in Papuan languages exhibits great variation in complexity. In some languages, there is little or no inflection, but in others the inflectional possibilities can be extremely complex. It is generally the case that verbs are inflectionally richer than nouns, often astoundingly so. The only widespread inflectional categories with nouns are case and, in many languages of diverse families, gender. The number of genders ranges from two to a dozen or more, as in the languages of the Torricelli and Lower Sepik-Ramu families, in which gender is further unusual in being determined for most nouns by their phonological properties. Languages with noun gender typically require all modifiers of the noun to take the proper inflectional affixes to agree in gender with the noun. Yimas of the Lower Sepik-Ramu family illustrates this well, with a pattern much like that of the Bantu languages of Africa. In the sentence 'I saw these my two big X', where X stands for the noun, if the noun is patntrm 'betelnuts', the sentence will be patntrm tma-k ama-na-ntrm kpa-ntrm tm-pal tma-ka-tay, but if kakl 'shells', it will be kakl kla-k ama-na-kl kpa-kl k-rpal kla-ka-tay, and if tanpl 'bones', tanpl pla-k ama-na-mpl kpa-mpl p-rpal pla-ka-tay, and finally if irwawl 'mats', irwawl ula-k ama-na-wl kpa-wl wu-rpal ula-ka-tay.
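The Yimas agreement pattern can be laid out as a small lookup table, which makes it easier to see how every word after the noun changes shape with the noun's gender. This is only a sketch: the forms are copied from the four sentences above, but the word-by-word labels ('these', 'my', 'big', 'two', 'saw') are an assumption about which Yimas word corresponds to which part of the translation, since the text gives no word-by-word gloss. The names AGREEMENT and sentence are illustrative.

```python
# Agreeing forms for each noun, copied from the four Yimas sentences.
# The slot labels are assumed, not given in the text.
SLOTS = ("these", "my", "big", "two", "saw")

AGREEMENT = {
    "patntrm": ("tma-k", "ama-na-ntrm", "kpa-ntrm", "tm-pal", "tma-ka-tay"),
    "kakl": ("kla-k", "ama-na-kl", "kpa-kl", "k-rpal", "kla-ka-tay"),
    "tanpl": ("pla-k", "ama-na-mpl", "kpa-mpl", "p-rpal", "pla-ka-tay"),
    "irwawl": ("ula-k", "ama-na-wl", "kpa-wl", "wu-rpal", "ula-ka-tay"),
}


def sentence(noun):
    """Build 'I saw these my two big <noun>' with the agreeing forms."""
    return " ".join((noun,) + AGREEMENT[noun])


def forms_by_slot(noun):
    """Pair each (assumed) slot label with the form agreeing with the noun."""
    return dict(zip(SLOTS, AGREEMENT[noun]))


for noun in AGREEMENT:
    print(sentence(noun))
```

Storing whole forms rather than computing them from a stem plus affix keeps the sketch faithful to the data, since the forms for 'two' (tm-pal beside k-rpal, p-rpal, wu-rpal) do not follow a single concatenation rule.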

The minimal inflection typically found on verbs is an affix, normally a suffix, for tense; for example, in Watam of the Lower Sepik-Ramu family, neg-rin give-PAST 'gave' (the hyphen separates the components that make up the word; the base of the word or root is neg- 'give', and the affix -rin is the PAST tense inflection). But verbs can be much more complex than that. For example, they generally mark the person and number of their subject, commonly that of their object, and occasionally that of their indirect object with affixes: for example, Fore of the Trans New Guinea family, Eastern Highlands subgroup, na-ka-i-e me-see-he-statement 'he sees me' (marking the object with a prefix and the subject with a suffix is diagnostic of Trans New Guinea languages) or Yimas of the Lower Sepik-Ramu family, na-n-a-ntuk-mpun it-he-give-PAST-them 'he gave it to them'.

In some languages the person and the number of the subject and object are expressed with different affixes, as in Nimboran of the family of the same name, gua-i-b-am live-PLURAL-above-he 'they live above' (a plurality of 'he' is 'they'). But verbs in Papuan languages can be still more complex; for example, they can commonly indicate aspect inflectionally, i.e. whether the action is completed or ongoing, as in Marind of the family of the same name, epa-no-kiparud ongoing-I-tie 'I am tying' versus menda-no-kiparud complete-I-tie 'I have tied'; or mood, whether the action is real, likely or simply possible, illustrated by Dani of the language family of the same name, wathi 'I killed him', wasik 'I will likely kill him', wale 'I may possibly kill him'. Finally, the verbs of Papuan languages can be very rich, incorporating information about the location or direction of the action and even its temporal coordinates, as in Alamblak of the Sepik Hill family, mi-bri-r down-go away-he 'he went down', or Yimas of the Lower Sepik-Ramu family, na-pay-wi-ka-pu-kia-ntut he-first-up-go-away-at night-PAST 'he first went away up at night', or indicating extra participants like the ultimate causer of the action or its beneficiary, illustrated by Kewa of the Engan family, ma-piraa-ru cause-sit-I PAST 'I made someone sit', or Tairora of the Trans New Guinea family, Eastern Highlands subgroup, rumpa-ti-mi-te-ro tie-me-for-completed-he PAST 'He tied it for me'.

The word order of sentences in Papuan languages tends to be varied. The languages of the Trans New Guinea family typically have the order subject-object-verb (SOV), as do perhaps the majority of other Papuan language families, but in some languages the order is really rather free. These languages are, however, characterized by postpositions (e.g. town in) rather than prepositions (e.g. in town), typically diagnostic of SOV languages. However, both the Torricelli and the East and West Bird's Head families are exceptions to this generalization; they have the order subject-verb-object (SVO) and, in the case of the Bird's Head families, prepositions (this may be due to Austronesian influence, but this is highly unlikely for the Torricelli family). Papuan languages are commonly characterized by chains consisting of a series of juxtaposed verbs, as in this example from Kalam of the Trans New Guinea family, Madang subgroup: mon pk d ap ay 'wood hit hold come put', describing the gathering of firewood.

Some Papuan languages have a very small inventory of verb roots so that verb series like this are necessary to provide full explicit descriptions. Kalam is exemplary in this regard; with fewer than 100 verbs of largely general meanings, such serial structures are necessary: n means 'perceive', so wdn n eye perceive means 'see', gos n thought perceive means 'think', wsn n sleep perceive means 'dream' and nb n eat perceive means 'taste'. Many Papuan languages also make a formal distinction among the verbs of these verb chaining structures, distinguishing an inflectionally simpler, formally stripped down dependent verb from an independent verb of full inflectional possibilities. Normally the dependent verb precedes the independent one and is dependent on it for specification of many inflectional categories, like tense, mood and often person and number of their shared subject. Iatmul of the Middle Sepik family illustrates this well: in v-laa y-ky-nt 'having seen it, then he will come', the first verb (v- 'see') is dependent and the second (y- 'come') independent. The latter is fully inflected for tense with the future suffix -ky and for subject with -nt 'he'; both of these suffixes are lacking on the dependent verb, which takes the sole suffix -laa, marking it as a dependent form and also indicating that the action of the verb it is suffixed to precedes that of the next verb: 'having…, then…'. The subject of the dependent verb must be the same as that of the next verb in order for a dependent verb to be used in Iatmul; if this is not the case, two independent verbs must be used: v-nt maa ynti 'he saw and he (someone else) came'. But this restriction does not apply to all Papuan languages; many Papuan languages, and especially those of the Trans New Guinea family, have special dependent verb forms, so-called 'switch reference' forms, which indicate whether the subject remains the same for the next verb or is different.
Barai of the Koiarian family is a good exemplar: bu ive i-na vua kuae they food eat-DEP SAME SUBJ talk say 'they ate and then told stories' (same subject for both, so the suffix -na is found on the dependent verb; 'they' are both eating and talking) versus bu ire i-mo no vua kuae they food eat-DEP DIFFERENT SUBJ we talk say 'they ate and then we told stories' (the subject of eating is 'they', but that of talking is 'we', so the subjects are different and the suffix -mo is required on the dependent verb).
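The choice between the two Barai suffixes can be stated as a simple rule: compare the subject of the dependent verb with the subject of the following verb. The sketch below is a deliberately simplified model of that rule, using only the two suffixes and the root i- 'eat' given above; the function name and the use of English labels for subjects are illustrative assumptions, and real switch-reference systems track who is being referred to rather than matching labels.

```python
# Barai dependent-verb suffixes as given in the examples above.
SAME_SUBJECT = "na"       # subject stays the same for the next verb
DIFFERENT_SUBJECT = "mo"  # subject changes for the next verb


def dependent_verb(root, subject, next_subject):
    """Return the dependent verb form, marked for same or different subject.

    Subjects are compared as plain labels, which is a simplification.
    """
    suffix = SAME_SUBJECT if subject == next_subject else DIFFERENT_SUBJECT
    return f"{root}-{suffix}"


# 'they ate and then (they) told stories'
print(dependent_verb("i", "they", "they"))
# 'they ate and then we told stories'
print(dependent_verb("i", "they", "we"))
```

The first call yields i-na and the second i-mo, matching the two Barai sentences.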

Webpage created by Voxcomm, a 2005 Arts Informatics Project. All information, images and media copyright of the Linguistics Department, USYD, 2005.