Resources for Speech Research
This page includes pointers to a number of resources for speech (and other psycholinguistic) research. Please feel free to add stuff!
There is also a local resources page which includes stuff useful to Edinburgh people.
- The Journal/Author Name Estimator
- Once you've finished your research project, where should you publish it?
- Experimental Linguistics in the Field
- This resource site has a bunch of pointers to useful software, ways of obtaining recordings outside the laboratory, etc.
(Non-)Words, Frequency, etc.
- The CELEX online lexical database
- For best results with CELEX, use Firefox.
- The Simple CELEX Interface
- A CELEX tool which sucks less, hosted at Frankfurt.
- The MRC Psycholinguistic Database
- Hosted at the University of Western Australia, the MRC database contains information on linguistic and psychological properties for about 150,000 words (although by no means every property is listed for every word). A comprehensive resource for psycholinguistic experiments. (Martin Corley's frequency tools provide an alternative interface to the MRC database.)
- Frequency Lists derived from the British National Corpus
- These very useful lists are created and made available by Adam Kilgarriff. They appear to be collated from version 1 of the BNC.
- AoA, Imageability and Familiarity for 1,526 English words
- The Bristol Norms have been scaled to be compatible with the Gilhooly and Logie (1980) norms, included in the MRC Database.
- Frequency lists from (American) film and television subtitles
- These lists are argued by Marc Brysbaert and Boris New to be much better than many existing lists, especially for short words.
- Frequency lists from British BBC broadcasts
- Presented as Zipf values, these lists, from Walter Van Heuven, Marc Brysbaert, and colleagues, are especially useful for work in the UK.
- N-watch and other utilities
- N-watch is a utility for calculating neighbourhood statistics and other lexical measures (see this paper for more information). Versions of N-watch are also available in Spanish and Basque. Note that these utilities require Windows/WINE.
- The ARC Nonword Database
- Hosted at Macquarie University, this is an invaluable resource for lexical experiments.
- Affective Ratings for nearly 14 thousand English words (Warriner et al.)
- Hosted by Marc Brysbaert at Ghent University.
- a program which gives access to much-improved LSA-like semantic vectors.
And finally, you might just find something at Kevin's List of WordLists.
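Since the BBC broadcast lists above are presented as Zipf values, it may help to see how those relate to raw counts. The sketch below uses the basic definition of the scale, log10 of the frequency per million words plus 3. The function name and arguments are illustrative only, and the published lists apply additional smoothing for rare and unseen words, so values computed this way may not match them exactly.

```python
import math


def zipf_from_count(count: int, corpus_size: int) -> float:
    """Convert a raw corpus count to a Zipf value.

    Basic definition only: log10(frequency per million words) + 3.
    No smoothing is applied (an assumption of this sketch), so a count
    of zero is rejected rather than estimated.
    """
    if count <= 0 or corpus_size <= 0:
        raise ValueError("count and corpus_size must be positive")
    per_million = count * 1_000_000 / corpus_size
    return math.log10(per_million) + 3.0


if __name__ == "__main__":
    # A word occurring once per million words sits at Zipf 3;
    # one occurring 100 times per million sits at Zipf 5.
    print(zipf_from_count(1, 1_000_000))
    print(zipf_from_count(100, 1_000_000))
```

On this scale, values of 3 and below are usually treated as low frequency, and values of 4 and above as high frequency.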
Images and Associated Information
- A large collection of images
- This collection includes “Snodgrass and Vanderwart-like” images (see here). Made available by Michael Tarr at Carnegie Mellon University.
- The International Picture Naming Project
- The UCSD Center for Research in Language is engaged in a large international study to provide norms for timed picture naming in seven different languages (American English, German, Mexican Spanish, Italian, Bulgarian, Hungarian, and the variant of Mandarin Chinese spoken in Taiwan). They currently have data for over 500 pictures.
- The Beckman Spoken Picture Naming Norms
- Norms on picture name agreement, etc., from Griffin & Huitema.
- The Bank of Standardized Stimuli (BOSS)
- A relatively new set of normed photographic images, described in this paper.
- Praat
- The industry standard in phonetics. Praat is immensely powerful, but its user interface isn't the most intuitive.
- Praat and Perl sound editing scripts
- Scripts for annotating, splicing, etc., provided by the Frankfurt University Phonetics Institute.
- BCBL tool for automatically annotating speech onset, with supposedly better performance than others.
- Used in conjunction with DMDX, checkvocal is a useful tool for checking voice onset times in naming-like tasks. For more information see Protopapas (2007).
- Audacity audio editor and recorder
- Audacity is a cross-platform audio editor which is powerful but easy to use.
- SoX sound exchange
- This cross-platform command-line utility is a “swiss army knife” of audio, allowing easy conversion between formats, resampling, adding of effects, etc.
- Zoom H2 handy recorder
- This is a recommendation for some hardware, which costs money. But for simple recording of responses or creation of audio materials, we've found the H2 hard to beat.
Statistics and Analysis
- R
- Fast becoming the statistical environment of choice for psycholinguistics. There is an Edinburgh Psychology R wiki with lots of useful information. RSeek is a customised Google interface which is invaluable for finding out all things R.
- alineR package
- R package to calculate ALINE (phonetic) distances between words. See Sadat et al. (2016).
- r-sig-mixed-models FAQ draft
- Amazingly useful FAQ for anyone doing mixed model analyses. Recommended reading.
- Spreadsheets for calculating MinF' and confidence intervals
- Struggling with stats requirements? Rob Hartsuiker has released a couple of invaluable spreadsheets which take all the hard work out of the calculations.
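For anyone curious about what goes into MinF', here is a minimal sketch of the standard formula (Clark, 1973), which combines the by-subjects F1 and by-items F2 into a single statistic. It is not a reproduction of the spreadsheets above: the function and argument names are made up for illustration, and it returns only the statistic and its denominator degrees of freedom, leaving the p-value to your stats package.

```python
def min_f_prime(f1: float, df1_error: float, f2: float, df2_error: float):
    """Return (minF', denominator df) from by-subjects and by-items F values.

    f1, df1_error: by-subjects F and its error (denominator) df
    f2, df2_error: by-items F and its error (denominator) df
    The numerator df of minF' is the same as that of F1 and F2.
    """
    if f1 <= 0 or f2 <= 0:
        raise ValueError("F values must be positive")
    if df1_error <= 0 or df2_error <= 0:
        raise ValueError("error df must be positive")
    min_f = (f1 * f2) / (f1 + f2)
    # Denominator df: (F1 + F2)^2 / (F1^2 / n2 + F2^2 / n1),
    # where n1 and n2 are the error df of F1 and F2 respectively.
    df_denominator = (f1 + f2) ** 2 / (f1 ** 2 / df2_error + f2 ** 2 / df1_error)
    return min_f, df_denominator


if __name__ == "__main__":
    # Hypothetical values, purely to show the call.
    stat, df = min_f_prime(f1=10.0, df1_error=20, f2=10.0, df2_error=20)
    print(f"minF' = {stat:.2f}, denominator df = {df:.1f}")
```

Note that MinF' is always smaller than both F1 and F2, so an effect can be significant by subjects and by items and still fail on MinF'.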