
All over Australia now people are writing reports on the progress of their grants - to attach to their begging-letters for more grants. Reading the reports gives you the sense that Australia is a garden of projects, each a mass of bright blossoms fragrant with success. (So why haven't we solved world poverty or climate change yet?) That's why it was really really good to go along to the ARC E-Research post-funding workshop (14-15 February), where participants were encouraged to report on the problems they encountered in their projects...

E-research is about using new technologies to create new kinds of data and to visualise data in new ways. The workshop brought together a range of humanities and science projects, based on their common use of large data stores, high speed grids, and similar computing issues. The format was speed-posters with four minute presentations. It was held in the Members' Dining Room in Old Parliament House, an elegant but wireless-less room. So participants had to listen or make notes on their nice new notepads (branded with the name of the caterer, source of rather good small eclairs), rather than sink into their e-mail. Project spruikers couldn't use their allocated four minutes to show us their stuff - a pain for projects whose goal was to increase connectivity by web access, and for anyone who wanted to click on the URLs of the projects we were hearing about.

Two projects of relevance to linguists involved annotating language data. The Ethno-ER project (partly sponsored by PARADISEC) has produced EOPAS, a tool for online representation of interlinear text linked to audio and video, which Nick Thieberger talked about. DADA-HCS, which Steve Cassidy and Roland Goecke presented, might provide a back-end for EOPAS.

It was striking how many concerns were shared by the projects (science, medicine, humanities), and how many of their concerns are familiar to people trying to manage linguistic fieldwork data.

Take the International data exchange for global gravitational wave astronomy. They want to share data internationally - just as digital archives of language material do. But..

Data sharing.. is all special cases, both bureaucratically and technologically. Just tracking software and configurations is a full-time job.

And take the MRI Data and computational facility people. They want to develop a workflow for acquiring data, providing metadata and distributing MRI scans, just like the kind of workflow PARADISEC has been developing. But the presenter said that they'd realised that "Our data wasn't well enough organised to take advantage of" - so they hooked up with a commercial digital assets management company to help them. PARADISEC and Ethno-ER have managed to collaborate with researchers in universities and CSIRO to get some of the programming work done, but DADA-HCS noted the problem of competing with industry to recruit good programmers.

Something which sent a shudder of guilty recognition through me came from a consortium of experimental protein folders. They want to exchange complex data on protein folding which requires accurate metadata, but

Users are too lazy to deposit their data. The simplest UI (User Interface?) is often too perplexing for the average biologist!

This was echoed by a project to develop web-surveying for epidemiological research: "If they can break it they will", and, more politely on their powerpoint: 'Participant proofing' is necessary for survey administration and getting clean data.

The Earthbyte people (makers of fine time-maps showing plate tectonics) emphasised how much longer it takes to develop software and infrastructures that conform to standards, rather than what they called "hero code" (which I think meant once-off stuff). Again, familiar to followers of PARADISEC.

A CI on another brain scan project that led to the development of the Australian Schizophrenia Research Bank was talking about how to cope with lots of little collections of brain scans and "aggressive host firewalls" to create a repository. Again, familiar to language data archives.

The complications of bureaucracy and collaboration were mentioned by Howard Morphy, CI of iDIG (Indigenous collections and knowledge archives research network) - they had a prototype web portal for museums to make their data accessible to Indigenous communities. Another achievement was getting an MOU between the collaborators to allow the material to be accessed. It mightn't seem much, but in these days of brutal contracts, expensive insurance and paranoia over worst case scenarios, it really is an achievement. (Ethno-ER suffered from a fifteen month delay in getting its MOU signed, despite much prodding.)

Many of the projects involved the creation of repositories. Thus the Australian Schizophrenia Research Bank hopes to be "the world's biggest online mental health research facility", and the other repositories of scientific data look pretty large too. One problem I bet they share with language data archives is the need for an eternity pill. That is, guaranteed long-term funding to maintain the repository.

There's not much hope of the eternity pill until governments and grant agencies and publishers require supporting data-sets to be lodged in public repositories. And even then, it will take a while for those public repositories to get the funding to maintain the data properly. In the meantime I fear that all too many important data-sets will disappear into black holes of old technology.

Back to begging-letters..


I have a comment on people not sharing metadata because the interface is too hard to understand. In my experience, the interfaces that are designed for accurate and standard data entry are often incredibly time-consuming. Drop-down menus in particular are the bane of my existence...


About the Blog

The Transient Building, symbolising the impermanence of language, houses both the Linguistics Department at Sydney University and PARADISEC, a digital archive for endangered Pacific languages and music.

