Project Homepage:

http://www.ausnc.org.au

Project Members:

Robyn Rebollo (Project Manager, r.rebollo@griffith.edu.au)

Mark Fallu (Lead Developer, M.Fallu@griffith.edu.au)

Gerhard Weis (Developer, g.weis@griffith.edu.au)

Mark Foreman (Developer, m.foreman@griffith.edu.au)

Joanne Morris (Data Source Administrator, j.morris@griffith.edu.au)

ANDS Contact:

Xiaobin Shen (Xiaobin.Shen@ands.org.au)

Project Status:

Completed

Australian National Corpus

Griffith University

Collaborator(s): Macquarie University

Project Description:

"Establishment of an Australian National Corpus that:
- Aggregates data from existing corpora residing at Australian universities, to provide a diverse and accurate representation of written and spoken languages in Australia.
- Allows the discovery, access and deposition of written and spoken data
- Allows linguistic researchers to collaboratively apply textual annotations to written and spoken data
- Is multimodal: supports text, multimodal text, audio and AV
- Is multilingual: English in Australia, Indigenous languages, community languages and sign languages
- Increases the potential for re-use of linguistic research datasets, and enables new research opportunities
- Is harvestable by the Australian Research Data Commons (ARDC)"

Data Type:

- Multimodal: supports text, multimodal text, audio and AV
- Multilingual: English in Australia, Indigenous languages, community languages and sign languages

Collections include the Griffith Corpus of Spoken Australian English, Monash Corpus of Australian English, Corpus of Oz Early English, Australian Corpus of English, Australian Component of the International Corpus of English, the Mitchell-Delbridge Tapes, and a portion of the Australian National Dictionary Centre corpus.

High Level Software Functionality:

Features:
- A website or web portal with search, deposit and annotation features
- A data store, which will house written and spoken data in a wide range of formats
- An annotation service, which will allow linking of annotations to spoken and written data
- A metadata store, for associating detailed metadata with written and spoken data in the Australian National Corpus (ANC). This service is essential for data discovery and interpretation of data
- Australian National Corpus collections will be ingested into the Metadata Hub; ANDS will harvest from the Griffith Metadata Hub
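
As an illustration only, the annotation feature listed above could be modelled with stand-off annotations, where each annotation points at a span of a corpus item instead of being embedded in the data. This is a minimal sketch and not the project's actual design: the names `Annotation`, `AnnotationStore`, `item_id` and `overlapping` are hypothetical, and offsets are assumed to be character positions for text or seconds for audio.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Annotation:
    """A stand-off annotation: it refers to a span and leaves the data untouched."""
    item_id: str    # hypothetical identifier of the annotated corpus item
    start: float    # span start (character offset or seconds, by assumption)
    end: float      # span end, exclusive
    label: str      # the annotation itself, e.g. a tag or a comment
    annotator: str  # who made the annotation


@dataclass
class AnnotationStore:
    """In-memory stand-in for an annotation service (illustrative only)."""
    _by_item: Dict[str, List[Annotation]] = field(default_factory=dict)

    def add(self, ann: Annotation) -> None:
        # Reject spans that end before they start.
        if ann.end < ann.start:
            raise ValueError("annotation span ends before it starts")
        self._by_item.setdefault(ann.item_id, []).append(ann)

    def overlapping(self, item_id: str, start: float, end: float) -> List[Annotation]:
        # Return annotations on item_id whose half-open span [start, end)
        # intersects the queried half-open span.
        return [
            a for a in self._by_item.get(item_id, [])
            if a.start < end and a.end > start
        ]
```

Because annotations are stored separately from the data, several researchers can annotate the same item without modifying it, which fits the collaborative annotation goal in the project description.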

ANZSRC-FOR code:

20 LANGUAGE, COMMUNICATION AND CULTURE