English intonation in the British Isles
Beta-Version of the Annotated IViE Corpus on CD-ROM
ESRC Award Number R000237145
This document contains copies of the text files which accompany the beta-version
of the IViE-corpus (released in 2000).
Current users of the beta-version: please note that some of the files have
been updated.
Read me first
About this corpus
How to see the data
Overview of files on the CD
The Stimuli (texts)
Keys to file name coding and speaker initials
Read Me First
The IViE beta-CD contains prosodically labelled data from the IViE corpus
(Economic and Social Research Council grant R000237145),
and the data can be viewed in xwaves(TM) under UNIX.
If you would like to see and hear data from the IViE Corpus, please
proceed as follows:
1. Start by reading the file 'About_this_corpus' (file included on the
CD - the web-version of the file is given below). In this file, you will find
information about the IViE corpus, the transcription system and the data
in this package.
2. Read 'How_to_see_the_data' next. This file explains how to view
prosodically labelled data from five urban varieties of British English,
how to transcribe unlabelled IViE data using the IViE system, and how to
use the IViE labeller to transcribe your own data.
3. In 'Overview_files', you will find a key to the directory structure and
the filenames in this package.
4. In 'The_Stimuli', you will find orthographic transcriptions of the data (but
note that you will also see orthographic transcriptions when you view the
data).
5. In 'IViE_labels', you will find copies of the IViE labeller and the IViE
menus. Information about how to use the labeller to transcribe your own
data is given in the file 'How_to_see_the_data'.
About this Corpus
The IViE corpus was set up for the investigation of cross-varietal and stylistic variation in
British English intonation. The beta-version of the corpus contains machine-readable,
prosodically labelled speech data from five urban varieties:
- Belfast English
- British Punjabi English spoken in Bradford
- Cambridge English
- Leeds English
- Newcastle English
The data were collected in urban secondary schools, and the speakers are 16 years old.
We have recorded six male and six female speakers from each variety.
The complete IViE Corpus contains data in five speaking styles, and the
beta-version contains speech files from all speaking styles, and
prosodically labelled files from
(1) The controlled sentences
Data from Cambridge, Leeds, Bradford Punjabi, Newcastle and Belfast
5 syntactic structures, 30 speakers, 660 sentences
(2) The read passage
Data from Cambridge, Bradford, Belfast
One section each of the Cinderella fairy tale
Additionally, there are unlabelled samples from
(3) the retold passage
(4) the map task
(5) the free conversation
produced by Cambridge, Bradford Punjabi, and Belfast English speakers.
The data were labelled with the IViE system for
prosodic labelling. For more information about the IViE labelling system, see the following
paper:
Grabe, E., Post, B., and Nolan, F. (to appear). Modelling intonational variation
in English: the IViE system. Proceedings of Prosody 2000, Krakow, Poland,
October 2000.
If you have comments on the beta-version of the IViE corpus, please write to
Esther Grabe or
Brechtje Post.
If you would like to view labelled data from the IViE corpus, please proceed to the
next section.
How to see the Data
The IViE beta-CD allows you to
I. View prosodically labelled data from five urban varieties of British English
II. Transcribe unlabelled IViE data using the IViE system
III. Use the IViE labeller to transcribe your own data
The data can be viewed in xwaves under UNIX.
Information about the IViE system for prosodic labelling can be found in the
IViE labelling guide.
How to make the Labeller Operational (UNIX, XWAVES)
Step 1 Create a directory IViE_Beta_Corpus
Step 2 Place the contents of this package into the directory IViE_Beta_Corpus
Step 3 Go to the /sentences/ directory
- type: cd sentences
Step 4 Open the file 'label' in a text editor (emacs, jot etc.)
- type: jot label
Step 5 find the following line:
TMP=/CDROM/IViE_labels/label$$
Edit this line.
Put in the directory path from YOUR machine which leads to the /sentences directory
e.g.
/home/myfiles/IViE_Beta_Corpus/sentences/label$$
To find out what the path is,
- type: pwd
at the command line while you're in the /sentences/ directory
Save your changes and quit the text editor.
Do the same in the file 'mlabel' which gives you an f0 range for male speakers.
Make the labellers executable by typing
chmod +x label
chmod +x mlabel
You're ready to use the labeller.
Do the same in the directories /Cinderella_passage/ and /spontaneous/.
Here, the directory path in label and mlabel
has to lead to the directories /Cinderella_passage/ and /spontaneous/.
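The path edit described above can also be scripted. The following is a minimal Python sketch, not part of the IViE package. The function name rewrite_tmp_line is hypothetical, and the sketch assumes that the labeller script contains exactly one line beginning with TMP=, as shown in Step 5, and that 'mlabel' uses the same form of line.

```python
import re


def rewrite_tmp_line(script_text, directory):
    """Return script_text with its TMP= line pointing at directory/label$$."""
    new_line = "TMP=" + directory.rstrip("/") + "/label$$"
    # A function is used as the replacement so that "$" and "\" in the
    # path are inserted literally rather than read as regex escapes.
    updated, count = re.subn(r"(?m)^TMP=.*$", lambda _m: new_line, script_text)
    if count != 1:
        raise ValueError(f"expected exactly one TMP= line, found {count}")
    return updated


if __name__ == "__main__":
    # Stand-in script text for illustration; the real 'label' file is longer.
    sample = "TMP=/CDROM/IViE_labels/label$$\n# rest of the script\n"
    print(rewrite_tmp_line(sample, "/home/myfiles/IViE_Beta_Corpus/sentences"))
```

To apply it, read the contents of 'label' (or 'mlabel') into a string, pass it through the function together with the output of pwd, and write the result back to the same file. Keep a backup copy of the original script.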
Viewing Transcribed Data
This package contains two directories with prosodically labelled data:
/sentences
/Cinderella_passage
The directory /spontaneous contains unlabelled spontaneous speech data for comparison.
(1) The Sentences
The sentence directory contains controlled sentences produced by six speakers (three
male and three female) from each of five different locations in the British Isles. There are five
different syntactic structures: simple statements, questions without morphosyntactic
markers, inversion questions, WH- questions and one type of coordination structure. All
sentences are labelled on two orthographic and three prosodic tiers. Data from each of the
five varieties can be found in their own subdirectory:
/sentences/Cambridge
/sentences/Leeds
/sentences/Bradford
/sentences/Newcastle
/sentences/Belfast
Each of these directories contains five further subdirectories in which you find the five
different syntactic structures, e.g.:
Directory name              Directory contains
/Cambridge/statements       8 different fully voiced statements
/Cambridge/Q_no_morph       3 different fully voiced questions without morphosyntactic markers
/Cambridge/WH_questions     3 different fully voiced WH-questions
/Cambridge/inversions       3 different fully voiced inversion questions
/Cambridge/coordinations    5 different fully voiced coordination structures, conjunction: 'or'
There are 132 prosodically labelled sentences from each variety.
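The figure of 132 follows from the counts in the directory listing, and the same arithmetic gives the 660 labelled sentences quoted for the whole beta-version. A short Python check, assuming that each of the six speakers per variety produced every sentence exactly once (the variable names are illustrative only):

```python
# Number of different sentences per syntactic structure, as listed above.
SENTENCES_PER_STRUCTURE = {
    "statements": 8,
    "Q_no_morph": 3,
    "WH_questions": 3,
    "inversions": 3,
    "coordinations": 5,
}
SPEAKERS_PER_VARIETY = 6  # three male and three female
VARIETIES = 5

sentences_per_speaker = sum(SENTENCES_PER_STRUCTURE.values())
per_variety = sentences_per_speaker * SPEAKERS_PER_VARIETY
total = per_variety * VARIETIES

print(sentences_per_speaker, per_variety, total)  # 22 132 660
```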
How to View the Sentence Data
Go to the /sentences directory.
To see an example of statement 1 produced by a male Belfast speaker
type mlabel Belfast/statements/s1bgm
('mlabel' starts up the labeller for a male speaker)
To see an example of statement 1 produced by a female Belfast speaker
type label Belfast/statements/s1bcc
('label' starts up the labeller for a female speaker)
You will see the pressure wave displayed at the top, the F0 trace at the bottom, and the
time-aligned labelling template in the middle.
The letters in the filename 's1bgm' mean: 's' = 'statement', '1' = sentence number 1, 'b' = Belfast,
'gm' = initials of male Belfast speaker GM. In 's1bcc', 'cc' = initials of female Belfast speaker CC.
A list of the files, and keys to filenames, speaker initials and speaker gender are given in
the file 'Overview_files' in this package.
(2) The Cinderella Passage
The second directory contains data from the Cinderella passage. There are three sections
of the passage in this directory, one from Cambridge English, one from British Punjabi
English spoken in Bradford, and one from Belfast.
To see these data:
Go to the Cinderella_passage directory
type mlabel p1cma to see the Cambridge passage
(p1cma = passage section 1, Cambridge, male speaker MA)
type label p1pfm to see the British Punjabi English passage from Bradford
(p1pfm = passage section 1, Punjabi, female speaker FM)
type mlabel p1bgm to see the Belfast passage
(p1bgm = passage section 1, Belfast, male speaker GM)
Listening to the Spontaneous Speech Data
If you would like to hear some spontaneous speech from the IViE corpus, or if you would
like to label some IViE data using the IViE system, go to the directory
/spontaneous
Here you find spontaneous speech data from Cambridge, Bradford Punjabi and Belfast
English. The speakers are the same as in the controlled sentences and the passage task.
To hear a retold version of a section from the Cinderella passage produced by a
Cambridge speaker
- type mlabel rcma
(retold speech, Cambridge, male speaker MA)
or sgplay rcma.d
The labelling templates will be empty. In the following sections, you can find out how to
see the menus and how to insert IViE labels.
Further spontaneous speech files are listed under 'More about the Spontaneous Speech Data' below.
IViE LABELLING
LABELLING ON THE ORTHOGRAPHIC TIER
If you put the cursor in the lowest tier (the orthographic tier), and hold down the right
mouse-button, you will see a menu that allows you to insert, delete, replace or move
around words. Type the words spoken by the speaker, one by one, and align them with
the end of the words in the speech wave using the MOVE button (right-mouse-menu).
After choosing MOVE, click the middle mouse button. The word which is closest to your
cursor will jump to the location of the cursor.
If the words have been inserted correctly, you can hear each word by clicking on the
word with the left mouse-button.
THE RHYTHMIC TIER
The second lowest tier (the rhythmic tier) is intended for the transcription of rhythmic
prominences. The right-mouse-menu offers three symbols: 'P' for prominence, '%' for
rhythmic boundary, and the hash sign for a hesitation, or a speech error. Insert P in the
middle of a prominent syllable, % at the end of a word that is followed by a rhythmic
boundary, and hash at the location of the hesitation or error.
For more on this tier and all following tiers, see the
IViE labelling guide.
THE TONE TARGET TIER
The middle tier is intended for the transcription of pitch movement surrounding
prominent syllables. The menu contains a selection of pitch movement labels. The labels
given in the menu are no more than a SUBSECTION of possible pitch movement labels.
Generally, transcribers make up their own labels and type them into the tier using the
'insert' command. The % is used to indicate the end of a pitch movement implementation
domain (ID) which coincides with an IP boundary. As IP boundaries are transcribed on
the phonological tier, some transcribers insert the % on the pitch movement tier only after
they have transcribed the intonational structure of the utterance on the phonological tier.
THE PHONOLOGICAL TIER
On the second highest tier, the intonational structure is labelled. The labels given in the
menu represent a pool of labels rather than a closed phonological system. Transcribers
can select different subsections of labels for different varieties of English. The labels
given allow us to account for the five varieties of English in this package.
THE COMMENT TIER
The highest tier is intended for alternative transcriptions and comments. Some options are
given in the right-mouse-menu. Otherwise, you can type in your own comments.
More about the Spontaneous Speech Data
To hear a retold version of a section from the Cinderella passage produced by a Bradford
Punjabi English speaker
type label rpfm
(retold speech, Punjabi, female speaker FM)
To hear a retold version of a section from the Cinderella passage produced by a Belfast
speaker
type mlabel rbgm
(retold speech, Belfast, male speaker GM)
To hear sections from the map task
type mlabel macma
(map task, section a, Cambridge, male speaker MA)
type mlabel mbcma
(map task, section b, Cambridge, male speaker MA)
type label mpfm
(map task, Punjabi, female speaker FM)
type mlabel m1brg
(map task, Belfast, male speaker GM)
To hear a section of free conversation
type mlabel fcma
(free conversation, Cambridge, male speaker MA)
type label fpfm
(free conversation, Punjabi, female speaker FM)
type mlabel fbgm
(free conversation, Belfast, male speaker GM)
How to Use the IViE Labeller to Transcribe Your Own Speech Data
Please note: we have updated the tone labels in the IViE labelling system; please
refer to the
IViE labelling guide.
If you would like to use the IViE labeller to transcribe your own data, you need to make a
separate directory on your machine in which you put:
(1) the speech files which are to be labelled
(2) f0 files made from the speech files in xwaves (you can type 'get_f0 infile outfile' to
make an f0 file from your speech file). The speech file and its f0 file must have the same
name, but different extensions. Give the speech file the extension '.d', and the f0 file the
extension '.f0' (i.e. filename.d and filename.f0).
(3) copies of the following files in this package:
label
mlabel
wordmenu
rhythmmenu
pitchmenu
tonemenu
commentmenu
You will find these files in the directory 'IViE_labels'.
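Before starting the labeller, it can help to confirm that the directory holds everything in the list above. The following Python sketch is not part of the IViE package. The function name check_labelling_directory is hypothetical, and the check relies only on the file names and the '.d' / '.f0' extensions given in the list.

```python
from pathlib import Path

# Labeller scripts and menu files named in item (3) above.
REQUIRED_FILES = ["label", "mlabel", "wordmenu", "rhythmmenu",
                  "pitchmenu", "tonemenu", "commentmenu"]


def check_labelling_directory(directory):
    """Return a list of problems found in a directory prepared for labelling."""
    root = Path(directory)
    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing labeller file: {name}")
    speech_files = sorted(root.glob("*.d"))
    if not speech_files:
        problems.append("no speech files (*.d) found")
    for speech in speech_files:
        # Each speech file needs an f0 file with the same name.
        if not speech.with_suffix(".f0").is_file():
            problems.append(f"no f0 file for {speech.name}")
    return problems


if __name__ == "__main__":
    for problem in check_labelling_directory("."):
        print(problem)
```

An empty result means that every required file is present and every speech file has a matching f0 file. It does not check that the files themselves are valid.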
When you've got all the relevant files in your directory, open the file 'label' using a text
editor (e.g. emacs or jot).
- type: jot label
then find the following line:
TMP=/CDROM/IViE_labels/label$$
Again, you need to change the directory path in this line. Otherwise, the labeller will not work in
your directory. Put in the directory path which leads to the directory you're in at the
moment.
e.g.
/home/myfiles/new_directory/label$$
To find out what the path is, type:
pwd
at the command line while you're in the directory with your speech files and the label
files.
Save your changes and quit the text editor.
Do the same in the file 'mlabel' which gives you an f0 range for male speakers.
Make the labeller executable by typing
chmod +x label
chmod +x mlabel
You're ready to use the labeller.
To open the data files and the labelling template, type
label filename for a female speaker, and
mlabel filename for a male speaker
Do not add any extensions.
You should see the speech wave at the top of your screen, an empty labelling template
with 5 tiers underneath, and the f0 trace at the bottom.
Finally, please note that the menu files and the labeller can be edited using a text editor.
For instance, you can change the labels in the tonemenu to suit your own transcription needs,
and you can add extra tiers or remove tiers if you edit the labelling script.
Overview of the Data Files on the CD
Sentence files can be viewed while you are in the /sentences directory.
Cinderella passage files are viewed from the /Cinderella_passage directory.
Spontaneous speech files are viewed from the /spontaneous directory.
I. The Sentences
Directory /sentences
Subdirectories /Belfast, /Bradford, /Cambridge, /Leeds, /Newcastle
Subdirectories within the directories for each variety are:
/statements, /Q_no_morph, /inversions, /WH_questions, /coordinations
(Q_no_morph = questions without morphosyntactic markers)
These subdirectories contain the data.
The directory /sentences/Belfast/statements, for instance, contains
s1bgm.extension
s2bgm.extension
s3bgm.extension
....etc.
Key to File Names
The first letter of the file name indicates the sentence type:
's' = statement, 'q' = question without morphosyntactic markers, 'w' = WH-question, 'i' =
inversion, 'c' = coordination.
The following number shows which sentence was produced (see the file 'The_Stimuli' for
texts). The directory '/statements', for instance, contains 8 different statements.
The next letter indicates the variety:
'c' = Cambridge, 'l' = Leeds, 'p' = British Punjabi, 'n' = Newcastle, 'b' = Belfast
The two letters at the end are the initials of the speaker; in 's1bgm', for instance, this is speaker GM.
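The key above can be applied mechanically. The following Python sketch decodes a sentence file name (without extension) into its four parts. It is an illustration only: the function name decode_sentence_filename is hypothetical, and the sketch covers the sentence files only, not the passage or spontaneous speech files.

```python
import re

SENTENCE_TYPES = {
    "s": "statement",
    "q": "question without morphosyntactic markers",
    "w": "WH-question",
    "i": "inversion",
    "c": "coordination",
}
VARIETIES = {
    "c": "Cambridge",
    "l": "Leeds",
    "p": "British Punjabi",
    "n": "Newcastle",
    "b": "Belfast",
}


def decode_sentence_filename(name):
    """Split a name such as 's1bgm' into type, number, variety and speaker."""
    match = re.fullmatch(r"([sqwic])(\d+)([clpnb])([a-z]{2})", name)
    if match is None:
        raise ValueError(f"not a sentence file name: {name!r}")
    kind, number, variety, initials = match.groups()
    return {
        "sentence_type": SENTENCE_TYPES[kind],
        "number": int(number),
        "variety": VARIETIES[variety],
        "speaker": initials.upper(),
    }


if __name__ == "__main__":
    print(decode_sentence_filename("s1bgm"))
```

Note that the letter 'c' means 'coordination' in first position and 'Cambridge' in third position, so the parts have to be read by position, as the pattern above does.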
Key to Speaker Initials
Belfast: male: GM, RG, RO; female: CC, AM, EC
Cambridge: male: MA, MC, PT; female: LP, ER, SM
Leeds: male: JP, MD, RP; female: CM, KF, NB
Newcastle: male: AM, MC, RF; female: EP, RP, VW
Bradford: male: AA, RH, WA; female: FM, KI, RA
How to View the Files
To view sentence files, you need to be in the /sentences directory. Choose a variety
(Cambridge, Belfast, Bradford, Newcastle, Leeds), a sentence type (statements,
Q_no_morph, inversions, WH_questions, coordinations), and a speaker (see list of
speaker initials above). For an inversion question produced by a female Bradford Punjabi
English speaker, type
label Bradford/inversions/i1pki
(i1pki = inversion question 1, Punjabi, speaker KI)
For a statement produced by a male Leeds English speaker, type
mlabel Leeds/statements/s1ljp
(s1ljp = statement 1, Leeds, speaker JP)
etc.
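The choice between 'label' and 'mlabel' and the file path can be derived from the keys above. The following Python sketch builds the command line for a sentence file. It is an illustration only: the function name labeller_command is hypothetical, and the speaker table is copied from the key to speaker initials. Gender is looked up per variety, because some initials (AM, RP) belong to a male speaker in one variety and a female speaker in another.

```python
SPEAKERS = {
    "Belfast": {"male": ["GM", "RG", "RO"], "female": ["CC", "AM", "EC"]},
    "Cambridge": {"male": ["MA", "MC", "PT"], "female": ["LP", "ER", "SM"]},
    "Leeds": {"male": ["JP", "MD", "RP"], "female": ["CM", "KF", "NB"]},
    "Newcastle": {"male": ["AM", "MC", "RF"], "female": ["EP", "RP", "VW"]},
    "Bradford": {"male": ["AA", "RH", "WA"], "female": ["FM", "KI", "RA"]},
}
VARIETY_LETTERS = {"Cambridge": "c", "Leeds": "l", "Bradford": "p",
                   "Newcastle": "n", "Belfast": "b"}
TYPE_LETTERS = {"statements": "s", "Q_no_morph": "q", "WH_questions": "w",
                "inversions": "i", "coordinations": "c"}


def labeller_command(variety, sentence_type, number, initials):
    """Return the command to type in the /sentences directory."""
    speakers = SPEAKERS[variety]
    initials = initials.upper()
    if initials in speakers["male"]:
        script = "mlabel"  # f0 range for male speakers
    elif initials in speakers["female"]:
        script = "label"   # f0 range for female speakers
    else:
        raise ValueError(f"no speaker {initials} listed for {variety}")
    filename = (TYPE_LETTERS[sentence_type] + str(number)
                + VARIETY_LETTERS[variety] + initials.lower())
    return f"{script} {variety}/{sentence_type}/{filename}"


if __name__ == "__main__":
    print(labeller_command("Bradford", "inversions", 1, "KI"))
    print(labeller_command("Leeds", "statements", 1, "JP"))
```

The two calls reproduce the example commands given above. The sketch does not check that the sentence number exists for the chosen sentence type.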
II. The Cinderella Passage
Directory /Cinderella_passage
Files: p1cma.extension
p1pfm.extension
p1bgm.extension
To see data from a male Cambridge English speaker, type
mlabel p1cma
To see data from a female Bradford Punjabi English speaker, type
label p1pfm
To see data from a male Belfast English speaker, type
mlabel p1bgm
III. Spontaneous Speech Data
In the directory /spontaneous, you find unlabelled data in 3 speaking styles from 3
different varieties. The speakers are the same as in the sentences and the passage task.
The /spontaneous directory contains:
(1) 3 retold versions of a section of the Cinderella passage (Cambridge, Bradford,
Belfast)
(2) 3 map task sections (Cambridge, Bradford, Belfast)
(3) 3 sections from our free conversation data (Cambridge, Bradford, Belfast)
To see the files, type the following:
(1) To hear semi-spontaneous (retold) speech data
mlabel rcma (Cambridge)
mlabel rbgm (Belfast)
label rpfm (Bradford)
(2) To hear Map Task data (the initials refer to one of the speakers only)
mlabel mcama (Cambridge, section a)
mlabel mcbma (Cambridge, section b)
mlabel mbgm (Belfast)
label mpfm (Bradford)
(3) To hear a conversation (the initials refer to one of the speakers in the pair)
mlabel fcma (Cambridge)
mlabel fbgm (Belfast)
label fpfm (Bradford)
The Stimuli
For a key to filenames for a particular stimulus, please read the section 'Keys to file
name coding and speaker initials'.
I. Sentences
(1) Simple Statements.
1. We live in Ealing.
2. You remembered the lilies.
3. We arrived in a limo.
4. They are on the railings.
5. We were in yellow.
6. He is on the lilo.
7. You are feeling mellow.
8. We were lying.
(2) Questions without morphosyntactic markers:
1. He is on the lilo?
2. You remembered the lilies?
3. You live in Ealing?
(3) Inversion questions:
1. May I lean on the railings?
2. May I leave the meal early?
3. Will you live in Ealing?
(4) WH-Questions:
1. Where is the manual?
2. When will you be in Ealing?
3. Why are we in a limo?
(5) Coordinations
1. Are you growing limes or lemons?
2. Is his name Miller or Mailer?
3. Did you say mellow or yellow?
4. Do you live in Ealing or Reading?
5. Did he say lino or lilo?
II. The Cinderella Passage
Once upon a time there was a girl called Cinderella. But everyone called her Cinders.
Cinders lived with her mother and two stepsisters called Lily and Rosa. Lily and Rosa
were very unfriendly and they were lazy girls. They spent all their time buying new
clothes and going to parties. Poor Cinders had to wear all their old hand-me-downs! And
she had to do the cleaning!
One day, a royal messenger came to announce a ball. The ball would be held at
the Royal Palace, in honour of the Queen's only son, Prince William. Lily and Rosa
thought this was divine. Prince William was gorgeous, and he was looking for a bride!
They dreamed of wedding bells!
When the evening of the ball arrived, Cinders had to help her sisters get ready.
They were in a bad mood. They'd wanted to buy some new gowns, but their mother said
that they had enough gowns. So they started shouting at Cinders. 'Find my jewels!'
yelled one. 'Find my hat!' howled the other. They wanted hairbrushes, hairpins and hair
spray.
When her sisters had gone, Cinders felt very down, and she cried. Suddenly, a
voice said: 'Why are you crying, my dear?'. It was her fairy godmother!
The girl poured her heart out: 'Lily and Rosa have it all!' she cried, 'even though
they're awful, and fat, and they're dull! And I want to go to the ball, and meet Prince
William!'
'You will, won't you?' laughed her fairy godmother. 'Go into the garden and
find me a pumpkin'. Cinders went, and found a splendid pumpkin which the fairy
changed into a dazzling carriage.
'Now bring me four white mice,' the godmother said. The girl went, and found
one... two...three...four mice. The fairy godmother changed the mice into four lovely
horses to pull the carriage.
Then the girl looked at her old rags. 'Oh dear!' she sighed. 'Where will I find
something to wear? I don't have a gown!' 'Hmmm...' said the fairy: 'Let's see, what do
you need? You'll need a ballgown... you need jewellery... you need shoes, and...
something needs to be done about your hair. And would you like a blue gown or a green
gown?'
For the third time, Cinders' godmother waved her magic wand. A ballgown, a
robe and jewels appeared. And there were some elegant glass slippers.
'You look wonderful,' her fairy godmother said, smiling. 'Just remember one thing - the
magic only lasts until midnight!' And off Cinders went to the ball.
In the Royal Palace, everyone was amazed by the radiant girl in the beautiful
ballgown. 'Who is she?' they asked. Prince William thought Cinders was the most
beautiful girl he had ever seen. 'Have we met?' he asked. 'And may I have the honour
of this dance?'
Prince William and Cinders danced for hours. Cinders was so glad that she failed
to remember her fairy godmother's warning. Suddenly the clock chimed midnight!
Cinders ran from the ballroom. 'Where are you going?' Prince William called. In
her hurry, Cinders lost one of her slippers. The Prince wanted to find Cinderella, but he
couldn't find the girl. 'I don't even know her name,' he sighed. But he held on to the
slipper.
After the ball, the Prince was resolved to find the beauty who had stolen his heart.
The glass slipper was his only clue. So he declared: 'The girl whose foot will fit this
slipper shall be my wife'. And he began to search the kingdom.
Every girl in the land was willing to try on the slipper. But the slipper was always
too small. When the Royal travellers arrived at Cinders' home, Lily and Rosa tried to
squeeze their feet into the slipper. But it was no use; their feet were enormous!
'Do you have any other girls?' the Prince asked Cinders' mother. 'One more,'
she replied. 'Oh no,' cried Lily and Rosa. 'She is much too busy!' But the Prince
insisted that all girls must try the slipper.
Cinders was embarrassed. She didn't want the Prince to see her in her old apron.
And her face was dirty! 'This is your daughter?' the Prince asked, amazed. But then
Cinders tried on the glass slipper, and it fitted perfectly!
The Prince looked carefully at the girl's face, and he recognised her. 'It's you, my
darling isn't it?' he yelled. 'Will you marry me?' Lily and Rosa were horrified. 'It was
you at the ball, Cinders?' they asked. They couldn't believe it! Then Cinders married
William, and they lived happily ever after.