
[from Peter K. Austin, Linguistics Department, SOAS]

Today I have a story to share that involves intellectual property violations, taking materials without attribution from a copyrighted dictionary of an Australian indigenous language, and publication of a book that contains such bad scholarship, ridiculous claims, nonsense, and stupid howlers that it is actually funny.

Over the past couple of years I have presented sessions at various workshops and training courses (most recently at a grantee training workshop held at SOAS 11-17th June) on the topics of "ethics, intellectual property rights and copyright". I have learnt a bit about copyright and moral rights in the process - my PowerPoint slides for the most recent presentation can be found here.

One of the issues that is often raised by fieldworkers and researchers during these presentations can be summarised as: "I don't want to make my data publicly available because someone will steal it and publish it under their own name". I usually reply in terms of the low likelihood of such an event happening and the protection afforded by copyright and moral rights (mentioning the World Intellectual Property Organisation and various other lobby groups). As Andrew Garrett said at an archiving workshop at the January 2008 Linguistic Society of America annual meeting (and I paraphrase): "Sorry to tell you this, but actually no-one wants to steal your data".

Well, unfortunately, I have to change my tune, folks, because it has happened to me. A subset of materials which I have published in book form (and deposited as Word .doc files with the ASEDA archive) and co-published with David Nathan on the web as the Kamilaroi/Gamilaraay Web Dictionary, all of which are clearly marked as copyright, has been reproduced without attribution or recognition of our authorship both on a website and in a recent book publication. Fortunately, this has been done in such a way as to reveal an ignorance on the part of the violator that is truly laughable. Sadly, this individual is attempting to profit financially from both our intellectual property and that of an Australian Aboriginal group, along with potentially damaging the trust we have built up by years of work with the community.

The story goes like this.

Professor Philip M Parker PhD, Professor of Marketing at INSEAD, "The Business School for the World" (based in Fontainebleau, France), has established a website called Webster's Online Dictionary, using the term "Webster's", which is now out of copyright. Note that this has nothing to do with Merriam-Webster, a highly reputable dictionary-making firm.

The blurb about Parker's website goes as follows:

"The goal of Websters Online Dictionary is to give all people of the world free access to a complete mapping of all known words to and from all written languages. In fulfilment of this goal, Websters Online Dictionary also offers as much information as possible for each word, including definitions, translations, images, trade name usage, quotations, and encyclopedic knowledge."

On the website it is possible to access wordlists in a wide variety of languages (see Note 2).

In addition to the web materials, Parker also publishes books which are marketed via Amazon and Target (in the US) and which purport to be "thesaurus dictionaries" of a range of languages. David Nash, who brought Parker's website to my attention, pointed out to me that one of these languages is Kamilaroi (more commonly and correctly known as Gamilaraay), a language traditionally spoken in north-west New South Wales that I have been working on since 1972 (an article I wrote on the history of research on Gamilaraay appears in the book by Bill McGregor that I blogged about recently).

So, I thought, how interesting to discover that there is a new Kamilaroi English Thesaurus Dictionary that has just been published, and curious to see its contents, I duly placed an order with Amazon.co.uk, paid UK£17.38 (that's roughly A$35) and waited for two weeks for the book to arrive.


And what an unusual and funny little piece of work it is. For my A$35 I got a little A5 book of 65 pages that contains a Preface, Kamilaroi to English Thesaurus, Index of English Subjects to Kamilaroi Subjects, Vocabulary Study Lists, and an Index. Nowhere is the identity or location of the Kamilaroi language given, nor the fact that it belongs to the Gamilaraay people of northern New South Wales. There is nothing about the spelling of Kamilaroi words, though the spelling system follows exactly that established in my publications.

The Preface begins with a truly silly statement (probably inserted from some boilerplate):

"This is an English thesaurus designed for Kamilaroi speakers who wish to better understand the ambiguities and richness of the English language"

Sorry, mate, but every Gamilaraay person is perfectly at home in English and doesn't need your little book to help them, certainly not at A$35 a toss, and not based on "444 Kamilaroi subject words".

The "Kamilaroi to English Thesaurus" is a list of 444 Gamilaraay headwords followed by a single word English gloss, or in some instances numbered multiple glosses. This information is taken straight from the web dictionary (or from my Gamilaraay Reference Dictionary published in 1993). This looks like stealing, you might think, but under a strict interpretation of copyright (which Prof Parker has no doubt checked with a lawyer) this is not "creative work". It is not possible to copyright common knowledge such as words and meanings. Unfortunately for Parker, some of the quoted forms, like muRumuRu on page 11 are creative works since they are reconstitutions which I have posited on the basis of 19th century published and unpublished amateur recordings (as explained in the preface of my dictionaries -- note that the orthographic R is not a Gamilaraay sound but a cover term for where I could not determine whether the source represented a flap rr or a continuant r). Now that is copying of creative work without attribution, in my view.

Following the gloss there is a string of "synonyms" and, for some entries, "antonyms" which have apparently been computer generated by a program that is seeded by the gloss. This is where the author's bad scholarship truly comes into play. Here is the entry for bindaya, a word borrowed into Australian English as "bindieye"; both it and the word 'burr' in the sense of a prickly weed are clearly missing from the lexicon of Professor Parker's computer program:

"bindaya burr; synonyms (n) flash, beard, brogue, burring, bramble, drawl, enunciation, inflection, intonation, pronunciation, twang, clinker (v) bur, clank, clink, jangle, pipe, creak, grate, jar, snub (adj) awn, catchweed, cleavers, clivers, goose, grass, hackle, hairif, hatchel"

Look out for those slurred bindieyes when you open your mouth next!

The remaining 443 entries are more-or-less of this type - a useless list of English words that bear little or no connection to the meaning of the Gamilaraay headword and that do not respect its morpho-syntactic category (headwords that contain a hyphen plus conjugation marker, like baaya-li 'to chop', are verbs in Gamilaraay but they still get noun glosses in this book). I was interested and surprised to learn that there are aardvarks and pangolins in Australia (listed as synonyms in the entry for bigibila 'echidna'). And so it goes for 19 pages.

The following "Index of English Subjects and Kamilaroi Subjects" is simply a listing of the English glosses with Gamilaraay headword repeated from the first 19 pages. Again this material is taken from my publications. The "Vocabulary Study Lists" are listings of Gamilaraay headwords with English gloss classified into groups as "Verbs", "Nouns" and "Adjectives" (again done with poor scholarship -- 'black swan' baRamal (a reconstitution, again) is listed under "Adjectives"). The book ends with a 34 page index giving page references for the English words that appear in all the preceding sections.

When I started reading this sad little book I felt angry, but when I got to the end of it I realised it is so badly done that I had to laugh.

So where does this leave us in relation to copyright and publication without attribution? I believe that what Parker has done is an aberration and that true scholars do not disrespect other scholars' work in this way (nor disrespect the communities whose languages we seek to document). Also, by making the Gamilaraay dictionary available on-line and depositing the materials with an archive, we are able to demonstrate clearly that our work has been made use of, and intellectual property rights violated. So my advice is do go ahead and publish your work -- that's how you establish copyright and get to assert moral rights. If or when they are abused then point this out publicly, as I have done here.

Oh, and don't waste your money buying copies of Parker's terrible book. It just encourages him.



Note 1: Thanks to David Nash for telling me about Parker's website and books and for corrections and comments on an earlier version of this posting. Thanks also to David Nathan both for digging around inside Parker's website and for correcting and commenting on an earlier draft of this blog. Neither David can be held responsible for the opinions expressed here.

Note 2: The terms of use state (rather ironically as it turns out):

"Students: If you are a student and want to use some of the sites content for a classroom assignment please feel free to cut and paste sections off of the web site, and paste them directly into your document. All of the pages are Microsoft Word compatible. Remember to include the "Source" at the bottom of the tables so your teachers cannot accuse you of plagiarism."


On this page we find:

"Copyright Notice: This site and its contents are Copyright 2004, Philip M. Parker and Webster's Online Dictionary (websters-online-dictionary.org). All rights reserved. All contents cited with permission, with license, or quoted under fair use doctrine remain the intellectual property of their respective originators. Other contents are in the public domain, and are used with attribution."


Pity that what is good for the goose isn't so good for the gander!

Comments

Parker has also published Webster's Kamilaroi to English Crossword Puzzles: Level 1 and Webster's English to Kamilaroi Crossword Puzzles: Level 1, each costing A$15. I have not wasted my money buying these two books, but I guess the content is also taken from the Gamilaraay Web Dictionary.

I stand corrected, Peter!

Actually I still believe my main point — sadly few linguists are interested in using, let alone misusing, unpublished language data deposited in archives — but this amazing case does highlight a danger of publication, probably especially web publication.

It's hard to believe that the guy can't be taken to task somehow for something. According to Amazon he has 85,764 books (that's right, eighty-five thousand and change). The many language books are bad enough, including not just dictionaries but also items like the "Kamilaroi" crossword puzzle book, but the long series of medical books could surely actually be dangerous to someone's health.

All but 3 of his 85,764 books are the product of data harvesting and automatic book publishing. I've done a bit of background research on this guy and plan to blog about it when I get the time to finalise the post (ALS is taking its toll).

I have bought each of the Wageman [sic] / English crossword books, and despite there being absolutely no information about the language, the speakers, or previous research or anything, they're not such a bad resource for teaching kids. Although, garbage in, garbage out; if the wordlist that he harvests isn't ready for publication, the result could be a sub-standard orthography, incorrectly glossed words, whatever.

While it's a risk to my own legal position, I'll post a scanned pdf of one of the pages.

Peter is there any lobbying action you could suggest for us concerned ppl to take? Even just an email to him to tell him off?

Wamut - some people have tried this already. Stephen Wilson, whose Wagiman Online Dictionary was the source for materials that Parker published as a "Wageman" print dictionary and crossword puzzles wrote to him and got the following response:

From what I have been told, no infringement can exist in mapping 2 words to each other. Courts have found that single word translations are considered discoveries, knowledge and/or are factual, and therefore do not benefit from copyrights (your web page presentations, layout, and original analysis can have copyright). Original or creative definitions, on the other hand, can carry copyright; I am careful never to use original definitions without permissions. Your site does not create a mapping from Wageman to an English thesaurus (i.e. I map Wageman words to English words not even contained on your site), nor does your site have crossword puzzles, which map Wageman to English via grids. Flashcards, language games, video learning tools, etc., similarly would not infringe. A web site can have a copyright, but the knowledge contained within it may not. This is what I have been told.

As I noted in my posting, Parker is adopting a strictly legal interpretation of copyright that enables him to view his work as not violating the law regarding copyright materials. So you could write to him to complain but I personally doubt that he will take much notice. It seems that he's got a little cottage industry going and won't be diverted by linguists pointing out the error of his ways, but I could be wrong.

I don't think that this counterexemplifies Andrew's point at all: this guy is not a scholar, he's a professor of marketing. That's like being an instructor in pimping, but with better liquor.

With regard to the question of whether the inclusion of reconstructions makes a difference legally, I am not so sure. The basic legal principle involved is that information cannot be copyrighted, only the expression of that information. The reason that you can publish your own version of the telephone directory, for example, is not that no creativity goes into constructing the numbers and pairing them with names and addresses. Rather, it is that the numbers, names, and addresses are uncopyrightable information, so a publication of them may only be copyrighted insofar as it presents them in a sufficiently creative way. If, for example, you were to publish a telephone directory in which each entry consisted of a limerick, it would be copyrightable due to the creativity of the limericks.

So my inclination is to say that due to the content/expression dichotomy in copyright law, the fact that some of the entries in your dictionary are your reconstructions rather than directly recorded words makes no legal difference. However, there are some decisions about "fictional facts" that might support the opposite conclusion. If you really want to know, short of a decision by a court you need a true copyright guru to look into it.

Bill - the only way really to test whether reconstructions (or reconstitutions based on early wordlists) are considered to be "creative works" is in court, I suspect, and I have neither the funds nor the time to do that. Probably all we can do is yell about how 'unacademic' and 'non-collegial' Prof Parker is.

Would you be willing to publicise Prof Parker's exploits on Language Log which would reach a wider audience than this blog or Matjjin-nehen? We know that materials from various Australian languages have been plagiarised but it may be that other linguists could find their e-published data has been vacuumed up by Prof Parker as well. It may be an issue of interest to the wider linguist audience.

Another tack could be to tell Amazon they are selling bad products. If someone has time to do up a standard email or letter, I'd be happy to send them a copy from me and I'm sure plenty of others will do the same.

Parker did the same thing to our online data for the Cheyenne language. He even included words from another language, Blackfoot, which are not in our online database. See my amazon.com reviews:

http://www.amazon.com/Websters-Cheyenne-English-Thesaurus-Dictionary/dp/0497834677

http://www.amazon.com/Websters-Cheyenne-English-Crossword-Puzzles/dp/0497826089/ref=sr_1_1?ie=UTF8&s=books&qid=1217128806&sr=1-1

I have informed a tribal linguist and suggested that the tribal attorney might want to get involved with the copyright issues. The tribal college owns the copyright to the website as well as a new dictionary which we are readying for publication.
