
[From our man in Hawai'i and Melbourne - Nick Thieberger]

The Australian government has millions of dollars that it will be spending on what it calls the National Collaborative Research Infrastructure Strategy (NCRIS) to support new technologies in research in Australia.

"Through NCRIS, the Government is providing $542 million over 2005-2011 to provide researchers with major research facilities, supporting infrastructure and networks necessary for world-class research."

DEST released a paper outlining what it called 'capabilities' which it proposed to fund, and they were ALL in the sciences, including lots of shiny pointy instruments (synchrotron, new telescopes and so on) to do the whizzbang experiments that are so popular and capture the imagination of politicians. While the physical science community has amazing capacity to pull in big research dollars, there are not that many of them, and even fewer who actually want to use each of these very expensive instruments.

On the other hand, the Humanities, Arts and Social Science (HASS) community is huge, and also does the kind of work that, in the main, is immediately relevant to those who fund it (taxpayers). So, in the consultation that followed, the clamour of HASS proponents resulted in a new 'capability' being added to the 'roadmap', but without any funding (yet) associated with it. There will be an 'Innovation White Paper' announcement before the end of 2008, and the current roadmap leads to the White Paper.

All of this is important for us, as it is the bucket from which national infrastructure like a National Data Service may be funded, and where policies on standards for data repositories like PARADISEC will be set. It is where funding will come from for the national computer facility that houses the online version of the PARADISEC collection.

More...

2000 Hours

30 June, 2008

Early this morning, a delivery of audio files was quietly sent from Paradisec's local server at the University of Sydney to permanent near-line tape storage at the Australian Partnership for Advanced Computing in Canberra. This happens on many days, as you might imagine, but what makes today's delivery special is that somewhere in that bunch of files was our 2000th archived hour of audio.

Moreover, we will soon be celebrating five years of operations. Seen that way, 2000 hours might not seem so impressive - it's just 400 hours per year after all - but we at Paradisec are very proud of our collection, especially given that just about everything here is done on a shoestring budget and there have been some lengthy hiatuses of funding lately.

Speaking of which, this may be an opportune time to mention that we are always amenable to generous donations from people wishing to sponsor the digitisation and preservation of a collection of data. See our website for more details.

So, just which file was the lucky 2000th hour? Well, we can't really be sure, but we do know that it was among a collection of Mark Durie's research into the dialects of Aceh, an area that was devastated by the Indian Ocean tsunami of Boxing Day 2004.

To help us celebrate both these milestones, Mark has kindly written a small piece for us about Aceh's dialects, his research on them and the importance of preserving the collection. He has also allowed a small portion of one of these recordings to be posted with this piece, which you can download here.

More...

A lot of work has been happening at the University of Sydney over the past six months, and at the end of last year the top floor of the Transient Building, which houses Linguistics, Paradisec and a few other offices, got renovated. Unfortunately, since the entire exterior of the building is composed of fibrous asbestos, it's unlikely that the University will outlay the mammoth insurance costs to do any exterior work. But anyone who knows the Transient Building knows that the best option would be to demolish the whole thing and start again from scratch.

More...

After some effort PARADISEC has finally established a streaming server that can be used in normal web pages. This means that an online dictionary, for example, can have example headwords and sentences spoken, or video clips presented to illustrate a given word. You can see the trial version here (NB: this will only work with the Firefox browser, and you will also need to pre-install the Annodex plugin).

For some time it has been troubling that we have no simple way of presenting media online in association with transcripts, especially when an archived field recording may be the only recording of a particular language. It should have been simple enough to access media on the web. After all, we do it on YouTube and other places. But we have been further constrained by really wanting all of this to be open source (freely available software) so that anyone with the right skills can replicate this setup and not have to pay. And we also wanted the process for getting material into an online presentation to follow on from normal fieldwork outcomes, in line with output from the tools typically used by a professional linguist (one who keeps up to date with the methods of their profession). When the archival form of the media exists in a repository, it should then be an automatable process to put it into a streaming server for access.

More...

Yesterday (27 October) was the first celebration of UNESCO's world day of audio-visual heritage. The trailer on that website, put together from the holdings of various audio-visual archives around the world, gives a flavour of the kind of material that is held in audio and film/video archives worldwide. Australia is fortunate to have many cultural institutions that hold and look after material recorded in Australia: the National Film and Sound Archive (NFSA), the Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS), the National Library of Australia (NLA), the National Archives of Australia (NAA) and many others.

More...

Video in fieldwork

1 October, 2007

Check out 'Language Archives Newsletter' (LAN) No. 10 (edited by David Nathan, Marcus Uneson, Paul Trilsbeek). It features articles on the role of video in language documentation by Patrick McConvell and Peter Wittenburg, as well as reviews of audio recorders including the Zoom H4.

LAN 10 Contents:
Video - A Linguist's View (A Reply to David Nathan), by Patrick McConvell
Video - A Technologist's View (A Reply to David Nathan), by Peter Wittenburg
Review: Audio Recorders Zoom H4 and Korg MR-1, by Paul Trilsbeek, Gerd Klaas
Review: Audio Recorder iRiver H320, by Bernard Howard
CLARIN Research Infrastructure Initiative, by Peter Wittenburg
Announcements etc

Go Xena!

24 September, 2007

So you want to preserve that MSWord novel, those spreadsheets, those AppleWorks fieldnotes forever?

The National Archives of Australia are ahead of you - they've developed free and open source software to help in the long term preservation of digital records. Xena! (XML Electronic Normalising for Archives - and I bet they thought hard to come up with the N).

I saw a demo of Xena a couple of years ago, and was greatly impressed by the potential of streamlining the workflow in digital text archives - by detecting the file formats of digital objects, and then converting them into open formats like XML for preservation. Databases remain the nightmare of course.
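For the curious, the "detecting the file formats" step can be sketched in a few lines of Python. This is only a rough illustration of the general idea (peeking at the first bytes of a file for a known signature), not how Xena itself is implemented; the names `SIGNATURES`, `detect_format` and `detect_file` are made up for this sketch, and the signature list is deliberately tiny.

```python
from typing import Optional

# A few well-known leading byte signatures ("magic numbers").
# Illustrative only: a real archive tool would know many more formats.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"PK\x03\x04", "ZIP container (e.g. .docx, .odt)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (e.g. legacy .doc, .xls)"),
]


def detect_format(data: bytes) -> Optional[str]:
    """Return a label for the first signature that matches, else None."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None


def detect_file(path: str) -> Optional[str]:
    """Read only the first few bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return detect_format(handle.read(16))
```

Anything the table does not recognise comes back as `None`, which is roughly where the hard cases (old word processor files, databases) would land and need a human to look at them.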

Anyway, there's a new release - and here are the details.

More...

Digital archives of photos, films and recordings are springing up in Indigenous communities, and some of them are even Getting Funding, hurrah! The Bill and Melinda Gates Foundation is giving a million US dollars to the Northern Territory State Library System:

"a 2007 Access to Learning Award recognizes the Northern Territory Library for providing free computer and Internet access and training to impoverished indigenous communities... The award honours the innovative Libraries and Knowledge Centres (LKC) Program, which provides communities with free access to computers and the Internet, and helps Indigenous Territorians to build digital collections of their culture through the Our Story database."

They've got Knowledge Centres at Milingimbi, Wadeye, Peppimenarti, Umbakumba, Angurugu, Pirlangimpi, Milikapiti, Barunga, Ti Tree, and Ltyentye Apurte.

As well, "Microsoft, a Global Libraries initiative partner, will donate US $224,000 in software and technology training curriculum to upgrade the organization’s 300 library computers." [Weep for us Mac users]

The Our Story database is an adaptation of the classic Filemaker Pro Ara Irititja program developed by the artist and historian John Dallwitz for the Anangu Pitjantjatjara.

Ara Irititja, a project of the Pitjantjatjara Council, commenced in 1994 when it was realised that a large amount of archival material about Anangu was not controlled by or accessible to them. This material was held in museums, libraries and private collections. Items held by private individuals were often at risk of being damaged or irretrievably lost. To date, a major focus of Ara Irititja’s work has been retrieving and securing such records for the benefit of Anangu and the broader Australian community.

The great advantage of Filemaker Pro was that it was off-the-shelf and fairly easy for people to use. There have been elaborate proposals, but going beyond glamour to making things work in remote communities is a very large step.

More...

[ from Nick Thieberger, PARADISEC, Melbourne University branch ]

I am a firm believer in open access to information, especially research information that has been created by taxpayers' funds. Thus it came as something of a surprise to find myself likened to the main man of the dark forces of corporate information ownership on a site formerly known as the 'Stolen Grammars' site.

Constructed by a linguist in Stockholm, the site offered downloadable versions of many grammars which had been copied from various locations ("Browse my collection of stolen .pdf reference grammars if you'd rather not pay.")

More...

Last Friday was a bit of a milestone for me, since, in the 6 or so months that I have been involved in the audio preservation side of things at PARADISEC, I hadn't yet actually cleaned a damaged audio tape. Unfortunately for me, the process isn't quite as straightforward as it is for a CD - warm soapy water, a non-abrasive cloth, wipe across the grain - rather, the entire process can take weeks, depending on how badly affected the tapes are.

More...

[From Peter K. Austin, Endangered Languages Academic Programme, SOAS]

On Wednesday last week (25th April) during Endangered Languages Week at SOAS there was a presentation on the "Dawes online" project at SOAS which aims to make an interactive digital facsimile of William Dawes' notebooks of the Sydney language available on the web. The project has produced high resolution digital images of the notebooks written by Dawes in 1790 and is developing searchable transcriptions of the manuscripts that will include the linguistic analysis made by Jaky Troy (published in 1993) along with topic maps (using the XTM standard for XML topic maps). This will enable users to search by topic, such as “animals” or “names” as well as linguistic topics, such as verb paradigms.

This project brings together knowledge and skills from archive studies, philology, linguistic analysis, and information and multimedia technologies. It is one of the more technically sophisticated of a series of projects that have emerged over the past several years to work on archival materials of Australian and Pacific languages, especially languages that have no or very few speakers. This work has parallels in the richly elaborated studies of Old English manuscripts published by Bernard Muir of Melbourne University as CDs and DVDs. The goal of both Muir’s work and the Dawes project is to present the original materials in an interactive format along with layers of standoff analytical markup.

A related kind of study is what we could call “second generation language documentation” (2GLD) where it is linguists’ fieldnotes and transcriptions which form the basis for documentation rather than speech events or speaker knowledge (usually because it is no longer possible to access such knowledge or events). Paradisec has photographed over 10,000 pages of fieldnotes on a wide range of languages for 2GLD purposes using the system developed at the Australian Science and Technology Heritage Centre. This includes Arthur Capell’s notes on Pacific languages.

More...

All over Australia now people are writing reports on the progress of their grants - to attach to their begging-letters for more grants. Reading the reports gives you the sense that Australia is a garden of projects, each a mass of bright blossoms fragrant with success. (So why haven't we solved world poverty or climate change yet?) That's why it was really really good to go along to the ARC E-Research post-funding workshop (14-15 February), where participants were encouraged to report on the problems they encountered in their projects...

More...

It's Australian grant application time! Joy, rapture! (Skips lightly around the room)
If you're thinking about what to spend your requested squillions on, here are two thoughts:

Archiving
Be realistic about how much it will cost to prepare your recordings for archiving, and then the cost of archiving itself - if you don't have a large friendly archive to hand. PARADISEC gives some guidelines on costs. And Dave Nathan has some shudder-inducing remarks on the current cost of archiving video. [1]

Getting manuscripts ready for publication
Many linguistics publishers do NOTHING about proof-reading or copy-editing your masterpiece. Your baby, you wash the nappies. And non-commercial linguistics publishers that do take copy-editing and proof-reading seriously, like Pacific Linguistics, need all the help you can give them - such as a publication subsidy to defray the costs of copy-editing. So imagine how many hours it might take to copy-edit your dictionary, double it, multiply by a suitable hourly rate - and build it into your application if you can.

[1] Video and Language Documentation: panacea or madness? presented at the DELAMAN IV meeting. 2 November 2006, SOAS.

[ Barry sent this in response to the Artefacts, labels and linguists post. He is the curator responsible for the Pacific Cultures Gallery at the South Australian Museum, and has a brilliant website for an Upper Sepik-Central New Guinea project on the relations between material culture and language, geographical propinquity, population, subsistence and environment.]

The outcome of the renovation of the Pacific Gallery is a compromise between the enormous task of upgrading and relabelling an exhibition of 3000 artefacts and the available funding. A lot of money went into removing the 1960s ceiling, replacing aircon, carpet and lighting, and a paint job. I did not agree with the shiny black but the Goths had the numbers in the committee.

We have begun the difficult task of providing renewed labelling in the wall cases - difficult because one case may have around a hundred items and in such instances we can't provide a label for each individual item - instead we will try to say something about the collectivity of objects that gives the viewer some sense of what they are looking at, in terms of types and geographical distribution (such as in the display of over 80 stone headed clubs). Electronic means of providing information will not be limited in this way and individual items will be provided with full data, including language groups (speech communities) from which the objects have come (where known).

More...

Whether languages can be property has generated further discussion, on Language Log, and on several anthropology blogs (thanks Kimberly!). Two themes emerged: power, and the potential conflict with open access.

More...

The surprise for me from the Sustainable Data from Digital Fieldwork workshop (aka Suzzy Data..) was how much plant taxonomists and field linguists have in common. And how much we need to work together with librarians and archivists. We both have to look after records - the decaying recordings of the languages, and the dried specimens in the herbariums. We both work with the living communities, the trees that will get logged and the communities that live with the trees, and the families and children who will switch to speaking another language.

More...

Dear ELAN Workshop attendees, and anyone who might find this of interest,

There were a few loose ends left at the end of the ELAN workshop last week. I'd particularly like to address one, the question as to whether we should aim for a standard set of ELAN templates which everyone uses.

More...

I wandered into the office today to see Jane and Mark with a large map of part of the Northern Territory rolled out on the floor, discussing the issue of isoglosses and boundaries. Maps maps maps. They're just everywhere at the moment!

More...

Last week, one of my favourite blogs, BoingBoing, had an interesting link to a new web-based research tool. I've been having a go over the weekend.

More...

Check out the latest Language Archives Network News [sorry Dave!] newsletter here. It's got helpful information on how the Max Planck Institute (Nijmegen) can help you set up a local archive, a system of cataloguing linguistics information (IMDI) about your recordings, and on getting permanent unique resource identifiers for stuff stored on the web. And it's also got an article on recording information about plants and animals in the field that you might read in conjunction with Tom's post on this topic.

More...

Our December conference is almost full, so if you were thinking of coming along, now is the time to register! The preliminary schedule is up, papers have been reviewed, everything is going along nicely (touch wood).

The third day of the conference is a workshop, with sections on audio and video recording, transcribing and managing your data, and producing outputs from this data. If this is more your thing you can come to just that. If you're interested in ELAN for transcribing or Shoebox/Toolbox, I thoroughly recommend it, but there'll be plenty of other useful stuff.

The Australian Research Council's website today has survived the pressure of everyone wanting to know whether they've got winning tickets. I was in a few syndicates (PARADISEC, continuing the Aboriginal Child Language Acquisition (ACLA) project, and a new project on Indonesian). And the lucky winners are...

More...

The preliminary schedule for the conference "Sustainable data from digital fieldwork: from creation to archive and back" is now up. There look to be some really interesting projects on display. I had a sneak peek at EOPAS, a project to create a workflow and display interlinearised texts, and Annodex, a project to display multiple streams of visual, audio and textual data, both of which look great. I'll also be talking about the FieldHelper tool I've been working on this year, a tool to aid in the tagging of field work data with arbitrary metadata, amongst other things.

Our registration quota of 40 places is fast filling up. Please register now if you wish to come. Also note that you can choose to come to just the third day workshop if your interest is more in practical experience with current digital field work tools.

RNLD in collaboration with the conference "Sustainable data from fieldwork" is offering a day-long session on the creation, organisation, annotation and display of digital media. I highly recommend this to anyone interested in making digital recordings and annotating them. If you're new to Shoebox or ELAN and have any questions about using it, and you have your own data, then bring along your laptop. The workshop will be held at Sydney University on Wednesday, December 6, 2006.

Read on for the specifics

More...

Every dead ethnographer (Indigenous or non-Indigenous) had a tin trunk in which all the information on the people, the language, the culture, anything, yes anything you want to know, could be found. But, I'm sorry, aunty died last week, and we don't know WHERE that tin trunk is now. (Source of observation: Michael Walsh). The anthropologist Ursula McConnel who worked with Wik Mungkan people on Cape York Peninsula, died in 1957, and people have been looking for her trunk ever since.

More...

I got inspired to preen our blogroll, by following up blogrolls on other linguistics blogs (notably Language Log). This meant hours of pleasure going through musings, dead blogs, frozen blogs, (very!) personal blogs, e-learning blogs exhorting us to use blogs in teaching, e-learning blogs exhorting us not to use them, pictures of cats, gardens, parrots, business blogs, meta-blogs...

The results?

More...

Jane's last post and a post on the ever excellent Language Log have got me thinking about permanence and accountability in the internet age. It's a theme that I encounter again and again, working for a digital archive.

More...

if you want to spend three years thinking and writing about languages and cultures of Australia and the Asia-Pacific region ...
Nod to Ethics committee: HEALTH WARNING: and you're not ESPECIALLY worried about whether you'll find an interesting job afterwards....

... applications for the 2007 APA/UPA scholarships at the University of Sydney are now open. Information and an application can be downloaded from:
http://www.usyd.edu.au/ro/training/postgraduate_awards.shtml

More...

Many academic disciplines depend on analysis of primary data captured during fieldwork. Increasingly, researchers today are using digital methods for the whole life cycle of their primary data, from capture to organisation, submission to a repository or archive, and later access and dissemination in publications, teaching resources and conference presentations. This conference and workshop will showcase a number of projects that have been developing innovative and sustainable ways of managing such data.

More...

Those of you in Canberra this week might be interested in the Australasian Sound Recordings Association annual conference "Listening" to be held 23-24 August at the National Film and Sound Archive. Among the several presentations of likely interest to readers of this blog will be the session on "Listening, Language and Culture" on 23 August, and a highlight will be the Alice Moyle lecture to be delivered by Gupapungu elder Joe Gumbula (Galiwin'ku Knowledge Centre, Elcho Island) on 24 August at 9.15. See the conference programme for full details!

On Thursday 29 June 2006 I joined heaps of overcoated people in the large, airy Reading Room of the Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) in Canberra. We were celebrating the launch of "Indigitisation" - a three year funded digitisation program for sound, text, film, and photographs. The view of lake, sky and trees and some determined ducks was a distraction from the speeches, but some things stuck - 40,000 hours of sound recordings of Indigenous languages to digitise, lots of expensive machines, some enthusiastic staff, and as yet no off-site backup. Storage problems mean they're not digitising everything at 24-bit, 96 kHz. They're planning to deliver some sound files through the web, where communities have given permission. So in future you should be able to click on some on-line catalogue entries and download sound files.
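To get a feel for why storage is a problem at that quality, here is a back-of-the-envelope calculation in Python. It is a sketch under stated assumptions, not AIATSIS's own figures: it assumes uncompressed stereo PCM, ignores file headers and backups, and uses decimal units (1 TB = 10^12 bytes). The function names are made up for illustration. On those assumptions, one hour at 24-bit, 96 kHz comes to about 2.07 GB, and 40,000 hours to about 83 TB.

```python
def pcm_bytes_per_hour(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Size in bytes of one hour of uncompressed PCM audio (no header overhead)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 3600


def collection_terabytes(hours: int, sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Total size in decimal terabytes for a collection of the given length."""
    return hours * pcm_bytes_per_hour(sample_rate_hz, bit_depth, channels) / 1e12


if __name__ == "__main__":
    # Assumption: stereo recordings; mono would halve these figures.
    print(pcm_bytes_per_hour(96_000, 24, 2))            # bytes for one hour
    print(collection_terabytes(40_000, 96_000, 24, 2))  # whole collection at 24-bit/96 kHz
    print(collection_terabytes(40_000, 44_100, 16, 2))  # same collection at CD quality
```

The same collection at CD quality (16-bit, 44.1 kHz) works out to roughly 25 TB on the same assumptions, which shows how much the choice of resolution matters.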

The AIATSIS Library staff showed "Collectors of words" - a web presentation of the nineteenth century word-lists of Australian languages from E. M. Curr and Victorian and Tasmanian languages from R. B. Brough Smyth. They're available as pdfs, organised alphabetically according to the place the words were attributed to, and linked to maps. A nice feature is the linking to the AIATSIS catalogue, so that you can find other materials referring to the same language group. Unfortunately the pdfs are only images - you can't search for text in them. If you want text copies of Curr, go for the transcribed copies in AIATSIS's electronic text archive ASEDA. These aren't yet linked to the scanned images - a job for the future!

More...

Pacific and Regional Archive for Digital Sources in Endangered Cultures / Sydney Humanities and Social Sciences e-Research Initiative Workshop

Presenters: Dr Linda Barwick, Director, PARADISEC and Frank Davey, Audio Preservation Officer, PARADISEC.
A free workshop covering: the range of research applications for recording and analysis of digital audiovisual media; questions of sustainability and archiving of audiovisual data; tools and resources for archiving, analysis and presentation of digital audio; the role of recordings in humanities disciplines; and using audio recordings in presentations and teaching. Includes hands-on sessions using Audacity sound editing software and Transcriber speech annotation software.

More...

The Authors

About the Blog

The Transient Building, symbolising the impermanence of language, houses both the Linguistics Department at Sydney University and PARADISEC, a digital archive for endangered Pacific languages and music.

FAQ

Papua New Guinea FAQs from Eva Lindstrom Papua New Guinea (New Ireland): Eva Lindstrom's tips for fieldworkers

Australian Languages Answers to some frequently asked questions about Australian languages

Papua Web Information network on Papua, Indonesia (formerly Irian Jaya)

Interesting Blogs

Omniglot Writing systems and languages of the world

LingFormant Linguistics news

Language hat Linguistics news and commentary

Jabal al-Lughat Linguistics news and commentary on a range of languages

Kiangardarup Indigenous concerns in south-west Western Australia

Living languages Blog with news items and discussion of endangered languages

OzPapersOnline Notices of recent work on the Indigenous languages of Australia

That Munanga linguist Community linguist blog

Anggarrgoon Claire Bowern's linguistics and fieldwork blog

Savage Minds A group blog on Anthropology

Language Log Group blog on language and linguistics

Arwarbukarl Indigenous Language and Information Technology Blog

Culture matters: applying anthropology Australian anthropology blog: postgraduates and staff

Indigenous Language SPEAK A forum for linguists, language speakers, educators and any other interested people to discuss any issues regarding language loss, language research, and fieldwork methodology within indigenous communities.

Long Road ethnography and anthropology blog - including about Australia

matjjin-nehen Blog on Australian linguistics, fieldwork, politics and the environment.

Langguj gel Australian linguistics and fieldwork blog

Links

E-MELD The E-MELD School of Best Practices in Digital Language Documentation

Tema Modersmål Website in Swedish with links to sites on and in many languages

Hans Rausing Endangered Languages Project: Language Documentation: What is it? Information on equipment, formats, and archiving, and examples of documentation

Technology-enhanced language revitalization Includes ILAT (Indigenous Languages and Technology) discussion list.

Endangered languages of Indigenous Peoples of Siberia

Koryak Net Information on the people of Kamchatka

Linguistic fieldwork preparation: a guide for field linguists syllabi, funding, technology, ethics, readings, bibliography

On-line resources for endangered languages

Papua New Guinea Language Resources Phonologies, grammars, dictionaries, literacy, language maps for many PNG languages

Resource network for linguistic diversity Networking practitioners working to record, retrieve & reintroduce endangered languages

Projects

ACLA child language acquisition in three Australian Aboriginal communities

DELAMAN The Digital Endangered Languages and Musics Archives Network

PARADISEC The Pacific And Regional Archive for Digital Sources in Endangered Cultures

Ethno EResearch Exploring methods and technology for collaborative electronic research

Murriny-Patha Song Project Documenting the language and music of public songs and dances composed and performed by Murriny Patha-speaking people

DOBES Endangered language documentation and archiving, funded by the Volkswagen Foundation and sponsored by the Max Planck Institute, Nijmegen.

DELP Documenting endangered languages at the University of Sydney
