How to view and search within the BNC Spoken Audio files using Praat
You can display an audio waveform, spectrogram and the corresponding
transcriptions using Praat. (If you don't have Praat, it's freely
available from here.) You'll need the audio file (say, 021A-C0897X0081XX-AAZZP0.wav) and the corresponding transcription in Praat TextGrid format (say, D92-0081.TextGrid).
1. Start the Praat software (click on its icon).
2. Ignore the "Praat Picture" window. (You can minimize it if you want to put it out of the way.)
3. Look at the line of menus at the top of the "Praat Objects" window.
Select "Read" then "Read from file"; a file browser window labelled
"Read Object(s) from file" will appear.
4. Browse around until you've found the .wav file, wherever you've put
it on your computer. With the .wav file name (including its complete
location pathname) in the Selection box at the bottom of the window, click on
"OK". The list of "Objects:" at the upper left of the "Praat Objects"
window should read:
1. Sound 021A-C0897X0081XX-AAZZP0
5. Repeat steps 3 and 4 to load the corresponding .TextGrid file into
Praat. The TextGrid will appear as the second item in your list of
Praat Objects.
6. Click on object 1 (the sound file name); it should turn to white text on a black background.
7. Press SHIFT and DownArrow together; object 2 (the TextGrid name)
will also turn to white text on a black background. This indicates that
you have selected both files together.
8. Click on the "Edit" button at the upper right of the Praat Objects
window. After a brief delay, a new window will pop up that displays the
audio waveform, a spectrogram (or a placeholder for a spectrogram below
it), and then the transcription tiers below that. In our Spoken Audio
BNC TextGrids, the upper transcription tier (tier 1) contains intervals
labelled with phoneme labels; tier 2, at the bottom, contains words in
ordinary spelling (in capital letters).
9. The buttons labelled "all", "in", "out", "sel" and "bak" at the
bottom left, and the scroll bar to their right, enable you to zoom in
and out and to move forwards and backwards through the audio.
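If you would like to check which tiers a TextGrid contains without opening Praat, the following is a minimal Python sketch, not part of the Praat workflow above. It assumes the TextGrid is saved in Praat's long ("full text") format; the tier names "phone" and "word" and the sample times are made up for illustration, and the real BNC TextGrids may name their tiers differently. Reading the file from disk (and choosing the right text encoding) is left to you.

```python
import re

# A tiny, made-up TextGrid in Praat's long text format (illustrative only).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.5
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "phone"
        xmin = 0
        xmax = 1.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "sp"
        intervals [2]:
            xmin = 0.4
            xmax = 1.5
            text = "HH"
    item [2]:
        class = "IntervalTier"
        name = "word"
        xmin = 0
        xmax = 1.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "sp"
        intervals [2]:
            xmin = 0.4
            xmax = 1.5
            text = "HERTFORD"
'''


def list_tiers(textgrid_text):
    """Return (name, class, size) for each tier, in file order."""
    tiers = []
    # Each tier begins with an "item [n]:" header; the text before the
    # first header is the file preamble, so it is dropped.
    chunks = re.split(r'item \[\d+\]:', textgrid_text)[1:]
    for chunk in chunks:
        tier_class = re.search(r'class = "([^"]*)"', chunk).group(1)
        name = re.search(r'name = "([^"]*)"', chunk).group(1)
        size = int(re.search(r'(?:intervals|points): size = (\d+)', chunk).group(1))
        tiers.append((name, tier_class, size))
    return tiers


for name, tier_class, size in list_tiers(SAMPLE):
    print(name, tier_class, size)
```

With a real file, you would pass the file's contents to list_tiers() instead of SAMPLE.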
To search for a particular word
(You can find out what words occur in each Spoken Audio BNC sampler
file by looking in the HTML transcription. Let's suppose that you look
at
http://www.phon.ox.ac.uk/SpokenBNCdata/D92.html and you want to hear how the word "Hertford" is pronounced.)
10. Using the "in" button, zoom in until you can read the text of the
transcriptions. Then, move the scroll bar to the extreme left, so that
you are looking at the start of the audio file.
11. In the bottom tier, click on the leftmost 'word'. (In the current
example, this is labelled "sp", which stands for "short pause".)
12. Press CONTROL and F together. A small window labelled "Find text" pops up.
13. Point and click in the text box, to make it active. Then, type in
the word you want to find. To agree with the transcriptions, this
should all be in capitals, i.e. HERTFORD. Click on the button labelled
"OK".
14. Assuming everything is in order (i.e., you typed correctly, the
word tier has been selected and the word you want is actually in the
TextGrid), Praat will move the view to the relevant portion of the
audio file. The found word will be in red text on an orange background,
and the phonemic transcription and the selected portion of audio will
have a pink background.
15. The duration of the selected word will be indicated in a grey box
immediately below the word transcription. To play the selected section
of audio, either click on that grey box or press the TAB key.
(Sometimes, you may get an error message saying that the audio playback
doesn't work. The most common reason for this is that some other window
or application on your computer has control over the audio playback. To
fix that, close any other applications that use sound, including
non-Praat windows that are viewing sound files.)
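The word search in steps 10 to 14 can also be approximated outside Praat. The sketch below is an illustration under stated assumptions, not Praat's own search: it assumes a long-format TextGrid with the word tier as tier 2 (as described in step 8), matches the whole interval label exactly and case-sensitively, and uses a made-up sample with invented times.

```python
import re

# Made-up long-format TextGrid with a phone tier and a word tier.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.5
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "phone"
        xmin = 0
        xmax = 1.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 1.5
            text = "sp"
    item [2]:
        class = "IntervalTier"
        name = "word"
        xmin = 0
        xmax = 1.5
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "sp"
        intervals [2]:
            xmin = 0.4
            xmax = 1.1
            text = "HERTFORD"
        intervals [3]:
            xmin = 1.1
            xmax = 1.5
            text = "COLLEGE"
'''

# One interval: its header, start time, end time and quoted label.
INTERVAL = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.eE+-]+)\s*'
    r'xmax = ([\d.eE+-]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def find_word(textgrid_text, word, tier_index=2):
    """Return (start, end) times in seconds of every interval labelled `word`."""
    # Split on tier headers; chunk 0 is the preamble, so drop it.
    chunks = re.split(r'item \[\d+\]:', textgrid_text)[1:]
    tier = chunks[tier_index - 1]
    hits = []
    for xmin, xmax, text in INTERVAL.findall(tier):
        if text.strip() == word:  # transcriptions are in capitals
            hits.append((float(xmin), float(xmax)))
    return hits


print(find_word(SAMPLE, "HERTFORD"))
```

Because the words are transcribed in capitals, the search term should be in capitals too, just as in step 13.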
Please note: just because Praat finds a portion of audio for the word
you have entered, that doesn't guarantee that the alignment is good,
i.e. in the right place! If you're lucky, it
will be just right. But frequently, the alignment is somewhat or VERY
inaccurate. Even then, however, you may find that you've been
taken to ROUGHLY the right portion of the file. If you zoom out a bit
and use your cursor to select a larger section of the audio waveform in
the upper part of the display, you can listen to a wider portion of the
speech. By referring to the corresponding HTML transcription file, you
might be able to work out where you are in the audio file.
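Listening to a wider stretch of audio around an uncertain alignment can also be done by cutting out a padded excerpt of the .wav file. The sketch below uses Python's standard wave module; the function name extract_window and the default padding of 2 seconds are arbitrary choices for illustration, and it assumes an uncompressed PCM .wav file.

```python
import wave


def extract_window(wav_source, out_target, start, end, padding=2.0):
    """Copy the audio from (start - padding) to (end + padding) seconds.

    wav_source and out_target may be file names or open binary file objects.
    Returns the (start, end) times in seconds of the excerpt actually written,
    clipped to the bounds of the recording.
    """
    with wave.open(wav_source, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Convert times to frame indices and clip them to the file.
        first = max(0, int((start - padding) * rate))
        last = min(total, int((end + padding) * rate))
        src.setpos(first)
        frames = src.readframes(last - first)
    with wave.open(out_target, "wb") as dst:
        # Same channels, sample width and rate as the source; the frame
        # count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return first / rate, last / rate
```

For example, if a word was found at 12.3 to 12.9 seconds, extract_window("021A-C0897X0081XX-AAZZP0.wav", "excerpt.wav", 12.3, 12.9) would write roughly 10.3 to 14.9 seconds of audio to excerpt.wav (those times and the output file name are invented for the example).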
To close the audio file or TextGrid (for example, if you want to inspect a different audio file)
16. In the "Praat Objects" window, select the Sound and TextGrid
objects and click on the button labelled "Remove" (in red) at the
bottom. This closes those files and also closes the display.
Go back to step 3, above, if you want to look at another audio file.