Longman Corpus Network
The BNC Spoken Corpus
WHAT IS THE BNC SPOKEN CORPUS?
As part of the British National Corpus, a major collaborative research project that collected over 100 million words of written and spoken English, Longman has developed a 10 million word spoken corpus. The Spoken Corpus consists of the natural, spontaneous conversations heard all around us, together with the language of lectures, business meetings, after-dinner speeches and chat shows. This is the first time that spoken English has been recorded in a systematic way on such a huge scale, and lexicographers and linguists now have their first opportunity to study English as it is spoken, the English that is found in the street.
HOW WAS THE LANGUAGE COLLECTED?
An independent market research agency was commissioned to select a carefully chosen cross-section of British English speakers in Great Britain. The agency collected spontaneous conversations from a sample of the population that was representative in terms of age, gender, social group and region.
WIRED FOR SOUND
Each person selected was given a small Walkman and a microphone and asked to record all the speech he or she heard or took part in over a one-week period. Because they were guaranteed total anonymity, the participants soon forgot about the recording equipment, a fact which is reflected in the (often quite humorous) contents and style of the conversations. The tapes were sent back to us at Longman, transcribed and entered onto the computer. The total number of participants is over two thousand.
WHAT DOES THE SPOKEN CORPUS TELL US?
The analysis of the Spoken Corpus is just beginning, but even in this relatively short time a number of important facts have already come to light. The use of idioms, for example, is far more widespread in spoken language than in written language. An idiom such as flash in the pan, which already occurs several times in the Spoken Corpus, doesn't appear at all in the 30 million word Longman/Lancaster Corpus.
What is unique about the Spoken Corpus is that it shows us how we really use English, not how we are supposed to use it or how we use it when we are writing. It reveals how very different the spoken word is from the written word. At last students will be able to study English in an exciting new range of ELT materials that represent English as it really is. For the first time, real spoken language has influenced the creation of a learner dictionary. The Longman Dictionary of Contemporary English is the only dictionary to recognise the importance of spoken English, showing the words and phrases used to communicate naturally in spoken English.
The British National Corpus project is a collaborative initiative carried out by Oxford University Press, Longman Group UK Ltd, Chambers, Lancaster University's Unit for Computer Research in the English Language, Oxford University Computing Services and the British Library. The project receives funding from the UK Department of Trade and Industry and the Science and Engineering Research Council within their Joint Framework for Information Technology.