OLAC Record oai:catalogue.elra.info:ELRA-S0484 |
Metadata | ||
Title: | ATCO2 Project Data | |
Access Rights: | Rights available for: nonCommercialUse, commercialUse | |
Date Available (W3CDTF): | 2022-10-19 | |
Date Issued (W3CDTF): | 2022-10-19 | |
Description: | The ATCO2 project aims to develop a unique platform for collecting, organizing and pre-processing air-traffic control (voice communication) data from air space. This project has received funding from the Clean Sky 2 Joint Undertaking (JU) under grant agreement No 864702. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and the Clean Sky 2 JU members other than the Union. The project collected the real-time voice communication between air-traffic controllers and pilots, available either directly through publicly accessible radio frequency channels or indirectly from air-navigation service providers (ANSPs). In addition to the voice communication data, contextual information is available in the form of metadata (i.e. surveillance data). The dataset consists of two distinct packages: (1) A corpus of ca. 4000 hours (untranscribed) of air-traffic control speech collected across different airports (Sion, Bern, Zurich, etc.) in .wav format for speech recognition. Speaker distribution is 90/10% between males and females, and the group contains native and non-native speakers of English. The raw data is also provided. Overall size of the dataset (measured after voice activity detection): 5281 hours (English + non-English), 4465 hours (English only). Overall raw size of audio files (sum of wav file lengths): 6225 hours (English + non-English). (2) A corpus of ca. 4 hours (transcribed) of air-traffic control speech collected across different airports (Sion, Bern, Zurich, etc.) in .wav format for speech recognition. Speaker distribution is 90/10% between males and females, and the group contains native and non-native speakers of English. This corpus has been manually transcribed and automatically annotated with orthographic information in XML format, with speaker noise information, SNR values and others. Ca. 1 hour of the annotation has been re-checked by a human. | |
Identifier: | ELRA-S0484 | |
Identifier: | ISLRN: 589-403-577-685-7 | |
Identifier (URI): | https://catalog.elra.info/en-us/repository/browse/ELRA-S0484/ | |
Language: | English | |
Language (ISO639): | eng | |
Medium: | Not specified | |
Publisher: | ELRA (European Language Resources Association) | |
Type (DCMI): | Sound | |
Type (OLAC): | primary_text | |
OLAC Info | ||
Archive: | ELRA Catalogue of Language Resources | |
Description: | http://www.language-archives.org/archive/catalogue.elra.info | |
GetRecord: | OAI-PMH request for OLAC format | |
GetRecord: | Pre-generated XML file | |
OAI Info | ||
OaiIdentifier: | oai:catalogue.elra.info:ELRA-S0484 | |
DateStamp: | 2022-10-19 | |
GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
Citation: | n.a. 2022. ELRA (European Language Resources Association). | |
Terms: | area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text |