
After some effort, PARADISEC has finally established a streaming server that can be used in normal web pages. This means that an online dictionary, for example, can have headwords and example sentences spoken, or video clips presented to illustrate a given word. You can see the trial version here. (NB: this will only work with the Firefox browser, and you will also need to pre-install the Annodex plugin.)

For some time it has been troubling that we have no simple way of presenting media online in association with transcripts, especially when an archived field recording may be the only recording of a particular language. It ought to be simple enough to access media on the web; after all, we do it on YouTube and other places. But we have been further constrained by wanting all of this to be open source (freely available software), so that anyone with the right skills can replicate this setup without having to pay. We also wanted the process for getting material into an online presentation to follow on from normal fieldwork outcomes, in line with output from the tools typically used by a professional linguist (one who keeps up to date with the methods of their profession). Once the archival form of the media exists in a repository, putting it into a streaming server for access should be an automatable process.

With various partners, PARADISEC has been part of an ARC-funded project called EthnoER, which has developed an online presentation of media and time-aligned transcripts. By uploading a media file and the transcript in Toolbox format we can present interlinear glossed text as seen in EOPAS here (the media needs to be played in Firefox using plugins available here). The enabling technologies are Annodex (developed by CSIRO) and EOPAS (the EthnoER online presentation and annotation system, developed in collaboration with Ronald Schroeter of the University of Queensland).
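As a rough illustration of the input side: a Toolbox transcript is a plain-text file of backslash-coded fields grouped into records. The Python sketch below groups such fields into records. The marker names (\ref, \tx, \ft), the sample text and the function name are assumptions made for illustration; each project defines its own markers, and this is not the EOPAS import code.

```python
def parse_toolbox(text, record_marker="ref"):
    """Group backslash-coded fields into a list of records (dicts).

    A new record starts at each line beginning with \\<record_marker>.
    Lines that do not start with a backslash continue the previous field.
    """
    records = []
    current = None       # record being filled, or None before the first one
    last_marker = None   # field that a continuation line should extend
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        if line.startswith("\\"):
            parts = line[1:].split(None, 1)
            if not parts:
                continue
            marker = parts[0]
            value = parts[1].strip() if len(parts) > 1 else ""
            if marker == record_marker:
                current = {}
                records.append(current)
            if current is None:
                # Header fields before the first record are skipped.
                last_marker = None
                continue
            if marker in current:
                current[marker] += " " + value
            else:
                current[marker] = value
            last_marker = marker
        elif current is not None and last_marker is not None:
            current[last_marker] += " " + line.strip()
    return records


# Hypothetical sample text, not taken from a real corpus.
SAMPLE = r"""\_sh v3.0  400  Text

\ref 001
\tx first example
   sentence
\ft A free translation.

\ref 002
\tx second example
"""

for record in parse_toolbox(SAMPLE):
    print(record)
```

Keeping each record as a plain dictionary makes it easy to pair the text and translation lines with the start and end times of the matching stretch of media later on.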

Once media is available for streaming, it can be called using fairly simple HTML and a JavaScript script, as seen in the trial version here. From PARADISEC's perspective, we should be able to use this technology to make archival media available (subject to deposit conditions) via a web browser.
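To give a feel for what such a call might involve, here is a small Python sketch that builds a link to one stretch of a streamed file. It assumes the server accepts a time-offset query of the form ?t=npt:START/END, with times in seconds; that query form, the server address and the file name are all assumptions for illustration, so check the trial page and your own server before relying on it.

```python
from urllib.parse import urlencode, urljoin


def clip_url(server, media_path, start_s, end_s=None):
    """Build a URL asking for part of a media file by time offset.

    Assumption: the server understands a query such as
    ?t=npt:12.500/20.000 (start and end in seconds).
    """
    if start_s < 0:
        raise ValueError("start must not be negative")
    if end_s is not None and end_s <= start_s:
        raise ValueError("end must be later than start")
    interval = f"npt:{start_s:.3f}"
    if end_s is not None:
        interval += f"/{end_s:.3f}"
    # Keep ':' and '/' readable instead of percent-encoding them.
    query = urlencode({"t": interval}, safe=":/")
    return urljoin(server, media_path) + "?" + query


# Hypothetical server and file names.
print(clip_url("http://media.example.org/", "demo/story.anx", 12.5, 20))
```

A web page would then only need one such link per headword or sentence, with the start and end times taken from the time-aligned transcript.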

To get to this point we have had to work through a number of issues. The idea is that there could be several Annodex servers, perhaps each associated with a linguistic archiving project. Selected media files are transcoded to one of the open-source Ogg formats and placed on the server, after which they can be called in the way seen in the trial page. (Our thanks to Stuart Hungerford and Jonathan McCabe of APAC for their help in setting up the PARADISEC Annodex server.) If you look at the HTML coding you will see that a JavaScript script controls how the call is made (thanks to Shane Stephens, formerly of CSIRO and now of Google, for his assistance here). To get your data into this form you will need to convert the time formats and so on to match the structure of the demo document. (To get you started, I have written an export routine in Audiamus 2.5 that creates a skeleton time-aligned document in the correct format.)
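The time-format conversion mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the target layout (a "transcript" element holding "segment" elements with H:MM:SS.mmm start and end times) is hypothetical and is not the structure of the demo document or the output of the Audiamus export, so adjust the names and the time format to match the demo.

```python
import xml.etree.ElementTree as ET


def to_timestamp(seconds):
    """Format a time in seconds as H:MM:SS.mmm."""
    if seconds < 0:
        raise ValueError("time must not be negative")
    # Work in whole milliseconds so rounding cannot produce '60' seconds.
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours}:{minutes:02d}:{secs:02d}.{ms:03d}"


def skeleton(segments):
    """Return an XML string with one empty segment per (start, end) pair.

    Element and attribute names are hypothetical placeholders.
    """
    root = ET.Element("transcript")
    for number, (start, end) in enumerate(segments, 1):
        ET.SubElement(
            root,
            "segment",
            id=f"s{number}",
            start=to_timestamp(start),
            end=to_timestamp(end),
        )
    return ET.tostring(root, encoding="unicode")


# Two made-up segments, with times in seconds.
print(skeleton([(0, 4.25), (4.25, 9)]))
```

The transcript text for each segment can then be filled in by hand or from the fieldwork tool's export.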

More technicalities about this process are discussed on the EthnoER Wiki page.

Comments

And check out Silvia's blog for more ideas about video on the web and a comment on EthnoER - only inaccuracy being that she has the post as written by Linda instead of Nick Thieberger.


About the Blog

The Transient Building, symbolising the impermanence of language, houses both the Linguistics Department at Sydney University and PARADISEC, a digital archive for endangered Pacific languages and music.