Progress Report 16/08/2001
Release of speech data from the IViE corpus
We had a very positive response to the release of the speech data
from the IViE corpus last month; we received over 70 requests in
the first week. We have now run out of CD-packs,
but the complete corpus is available on the web (see URLs below).
NB: Information for colleagues who have ordered a CD-pack: all CD-sets
have been burned and packaged (12/08/01) and they'll be sent out this
week.
Here are the URLs for the on-line versions of the corpus:
Audio Page: Searching the corpus, listening to individual files and downloading
individual files
Download Page: Downloading packs of data from the corpus, sorted by variety and
speaking style (.tar)
Update on the annotated IViE CD
Labelling is (again) in progress. The annotated
CD, which we will publish early next year, will contain a selection
of data from seven varieties: Belfast, Bradford, Cambridge, Dublin,
Leeds, Liverpool, London and Newcastle
and five speaking styles.
Where we are now:
We have completed the labelling of
(1) the sentence data produced by three male and three female speakers
from the seven varieties listed above
(approximately 90 minutes of speech)
(2) the read speech data
(three male and three female speakers,
one section from the Cinderella passage, seven varieties;
approximately 35 minutes of speech).
Additionally, we have labelled complete Cinderella passages from 6 Belfast and
6 Cambridge speakers (approximately 50 minutes of speech).
A one-minute file of read or semi-spontaneously produced
speech data can be labelled in approximately one hour; one
minute of interactive speech data requires approximately two hours.
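As a rough illustration of what these rates imply, the short sketch below adds up the approximate durations reported above and converts them into labelling hours. It assumes that all of the material labelled so far counts as read speech (one hour per minute); the names used in the code are illustrative only and are not part of any IViE tool.

```python
# Labelling rates quoted in the report: hours of work per minute of speech.
HOURS_PER_MINUTE = {"read": 1.0, "interactive": 2.0}


def labelling_hours(minutes, style):
    """Approximate labelling effort in hours for `minutes` of speech."""
    return minutes * HOURS_PER_MINUTE[style]


# Approximate durations (in minutes) of the material reported as labelled.
labelled_minutes = {
    "sentences": 90,
    "read passage (one section)": 35,
    "complete Cinderella passages": 50,
}

total_minutes = sum(labelled_minutes.values())
# Assumption: everything labelled so far is treated at the read-speech rate.
total_hours = labelling_hours(total_minutes, "read")

print(f"{total_minutes} minutes labelled, roughly {total_hours:.0f} hours of work")
```

On these figures, the material labelled so far comes to about 175 minutes of speech, or roughly 175 hours of labelling; the same amount of interactive speech would take about twice as long.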
Conferences
Esther and Brechtje gave a talk about
the IViE project at UKLVC
in York.