Difference between revisions of "Audio"

Revision as of 14:23, 28 April 2025

When using audio in your experiment, especially when presenting time-critical stimuli, special care should be taken to optimize the audio settings on multiple levels (hardware, OS, script), as many things can go wrong along the way.

This page outlines some best practices; however, we advise you to always consult a TSG member if you plan to run an audio experiment in the labs.

Recording

When recording audio for stimulus material or as input for your experiment, please:

Use a high-quality microphone with a polar pattern suitable for your application.
Use a high-quality recorder or audio interface capable of recording at 24-bit, 48 kHz or higher.
Place the microphone at an appropriate distance from your subject. Set the levels so the audio does not clip (exceed the maximum level).
Record in a quiet environment.
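
To check a finished recording for clipping, a short script can report the peak level of a file. The following is a minimal sketch, not a lab-provided tool: it assumes an uncompressed 16-bit or 24-bit PCM .wav file, and the function names and the 0.999 threshold are illustrative choices.

```python
import wave


def peak_level(path):
    """Return the peak absolute sample value as a fraction of full scale (0.0 to 1.0)."""
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()  # bytes per sample: 2 = 16-bit, 3 = 24-bit
        if width not in (2, 3):
            raise ValueError("this sketch only handles 16-bit and 24-bit PCM")
        frames = wav.readframes(wav.getnframes())

    full_scale = 2 ** (8 * width - 1)  # e.g. 32768 for 16-bit audio
    peak = 0
    # Channels are interleaved; for a peak check all samples can be treated alike.
    for i in range(0, len(frames) - width + 1, width):
        sample = int.from_bytes(frames[i:i + width], "little", signed=True)
        peak = max(peak, abs(sample))
    return peak / full_scale


def is_clipped(path, threshold=0.999):
    """Flag a file whose peak reaches (almost) full scale. The threshold is an assumption."""
    return peak_level(path) >= threshold
```

For example, `is_clipped("voice.wav")` returning `True` suggests the input level was set too high and the recording should be repeated at a lower gain.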

You can use our Sound Recording Labs for high-quality voice recording.

Editing

We recommend using Audacity for editing and converting audio files. Audacity is open source and fairly easy to use. It is available here: https://www.audacityteam.org/

Export Settings

We recommend using the following export settings:

File format: .wav (PCM).
Sample Frequency: 48kHz.
Bit depth: 16 bit.

The Lab Computer audio output is also set to 16 bit, 48 kHz. We found that this is good enough for most applications; higher settings increase file size with little perceivable gain in quality.
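
To get a feel for the file size trade-off, the data rate of uncompressed PCM audio can be computed directly. This is a small illustrative calculation; the stereo channel count and the 24-bit, 96 kHz comparison setting are assumptions chosen for the example, not lab recommendations.

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels=1):
    """Data rate of uncompressed PCM audio, ignoring the small .wav header."""
    return sample_rate_hz * (bit_depth // 8) * channels


# Recommended export settings, assuming stereo files
recommended = pcm_bytes_per_second(48_000, 16, channels=2)  # 192000 bytes per second

# A hypothetical higher-resolution setting for comparison
high_res = pcm_bytes_per_second(96_000, 24, channels=2)     # 576000 bytes per second

print(high_res / recommended)  # 3.0
```

Under these assumptions, the higher setting triples the file size for every second of audio.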

When using multiple audio files in your experiment, make sure they all use the same settings so that playback is consistent.
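
One way to verify this is to read the header of every file and group the files by their settings. The sketch below uses Python's standard `wave` module; the function names and the `stimuli` folder are hypothetical, and it assumes all files are PCM .wav files (other encodings make `wave.open` raise an error).

```python
import wave
from pathlib import Path


def wav_settings(path):
    """Return (sample rate in Hz, bit depth, number of channels) of a PCM .wav file."""
    with wave.open(str(path), "rb") as wav:
        return (wav.getframerate(), 8 * wav.getsampwidth(), wav.getnchannels())


def group_by_settings(folder):
    """Map each distinct settings tuple to the names of the files that use it."""
    groups = {}
    for path in sorted(Path(folder).glob("*.wav")):
        groups.setdefault(wav_settings(path), []).append(path.name)
    return groups
```

Calling `group_by_settings("stimuli")` should return a dictionary with exactly one key, ideally `(48000, 16, <channels>)`. More than one key means some files need to be re-exported.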

In Audacity, you can set up Macros to automate processing and exporting your audio files: https://manual.audacityteam.org/man/macros.html

Windows Settings

Windows 10 has a habit of automatically enabling audio enhancements when connecting new speakers or headphones. These "enhancements" can distort your audio and cause timing issues. Therefore, please make sure they are turned off:

Right-click the sound icon on the taskbar (next to the clock) and select Sounds.
Go to the Playback tab. Select your audio output device and click "Properties".
Go to the Enhancements tab. Make sure "Disable all enhancements" is checked.
Click Apply.

Playback

PsychoPy

This is an example of a Python script that plays a .wav file with high time accuracy.

from psychopy import prefs

# Select the Psychtoolbox (PTB) audio backend. This must be set before
# importing psychopy.sound, otherwise the preference has no effect.
prefs.hardware['audioLib'] = ['PTB']

from psychopy import sound, core

# Path to audio file
audio_file = "voice.wav"

# Load audio
# preBuffer: integer controlling streaming/buffering; -1 means store the whole file
audio = sound.Sound(audio_file, preBuffer=-1)

# Play audio
audio.play()

# Wait for audio to finish playing
core.wait(audio.getDuration())

# Close audio
audio.stop()
audio.close()

@@ Line 1: / Line 1: @@
-When using video in your experiment, especially when presenting time-critical stimuli, special care should be taken to optimize the video and audio settings on multiple levels (hardware, OS, script), as many things can go wrong along the way.
+When using audio in your experiment, especially when presenting time-critical stimuli, special care should be taken to optimize the audio settings on multiple levels (hardware, OS, script), as many things can go wrong along the way.
-This page outlines some best practices; however, we advise to always consult a TSG member if you plan to run a video experiment in the labs.
+This page outlines some best practices, however we advise to always consult a TSG member if you plan to run an audio experiment in the labs.
 ==Recording==
-When recording video for stimulus material or as input for your experiment, please:
+When recording audio for stimuli material or as input for your experiment, please:
-Use a high-quality camera, with settings appropriate for your application (e.g., frame rate, resolution).
+* Use a high quality microphone, with a [https://www.audio-technica.com/en-us/support/a-brief-guide-to-microphones-whats-the-pattern/ polar pattern] suitable for your application.
-Use a high-quality recorder or capture device, capable of recording at 1080p (1920×1080) and 60fps or higher.
+* Use a high quality recorder or audio interface, capable of recording at 24bit and 48kHz or higher.
-Stabilize the camera and avoid automatic exposure, white balance, or focus during recording to prevent inconsistencies.
+* Place the microphone at an appropriate distance from your subject. Set the levels so the audio does not clip (exceeding maximum volume).
-Record in a controlled environment with consistent lighting and minimal background distractions.
+* Record in a quiet environment.
-You can use the facecam for high quality video recording.
+You can use our [[Sound Recording Lab]]s for high quality voice recording.
 ==Editing==
-We recommend using DaVinci Resolve for editing and converting video files. DaVinci Resolve is a free, professional-grade editing program, available here: https://www.blackmagicdesign.com/products/davinciresolve
+We recommend using Audacity for editing and converting audio files. Audacity is open-source and fairly easy to use, available here: https://www.audacityteam.org/
-Alternatively, you can use Shotcut, a simple open-source editor, available here: https://shotcut.org/
+===Export Settings===
+We recommend using the following export settings:
+* File format: .wav (PCM).
+* Sample Frequency: 48kHz.
+* Bit depth: 16 bit.
-===Video Settings===
+The [[Lab Computer]] audio output is also set to 16 bit, 48kHz. We found that this is good enough for most applications; higher settings will increase file size with limited perceivable quality gains.
-We recommend using the following settings:
-File format: .mp4 (H.264 codec(libx264))
-Frame rate: 60 fps (frames per second)
-Resolution: 1920×1080 (Full HD) or match your experiment's display settings
-Bitrate: 10-20 Mbps for Full HD video
-Constant Frame Rate (CFR): Always enforce a constant frame rate.
-   Example: -vsync cfr in ffmpeg.
+When using multiple audio files in your experiment, make sure they all use the same settings for consistent playback in your experiment.
-The [[Lab Computer]] displays are typically set to 1920×1080 at 120Hz. We found that this is sufficient for most applications. There are possibilities to go higher.
+In Audacity, you can set up Macros to automate processing and exporting your audio files: https://manual.audacityteam.org/man/macros.html
 ==Windows Settings==
-Windows 10 has a habit of automatically enabling '''video enhancements''' or unnecessary processing features, which can interfere with smooth playback. Therefore, please make sure these are disabled:
+Windows 10 has a habit of automatically enabling '''audio enhancements''' when connecting new speakers or headphones. These "enhancements" can distort your audio and cause timing issues. Therefore, please make sure they are turned off:
+# Right click sound icon on taskbar (next to clock) -> Sounds
+# Goto Playback tab. Select your audio output device and click "Properties"
+# Goto Enhancements tab. Make sure "Disable all enhancements" is checked.
+# Click Apply.
-Open Settings → System → Display → Graphics Settings.
-If available, disable "Hardware-accelerated GPU scheduling" for critical timing experiments.
-For specific applications (e.g., PsychoPy), under "Graphics Performance Preference," set them to "High Performance" to ensure they use the dedicated GPU.
 ==Playback==
-=== PsychoPy ===
+=== Psychopy ===
-This is an example of a Python script that plays a .mp4 video file with high time accuracy. <syntaxhighlight lang="python" line> from psychopy import visual, core, prefs prefs.hardware['videoLib'] = ['avbin', 'ffpyplayer'] # Choose based on installed libraries
+This is an example of a Python script that plays a .wav file with high time accuracy.
+<syntaxhighlight lang="python" line>
+from psychopy import sound, core
+from psychopy import prefs
+prefs.hardware['audioLib'] = ['PTB']
-Create a window
+# Path to audio file
-win = visual.Window(fullscr=True, monitor="testMonitor", units="pix")
+audio_file = "voice.wav"
-Path to video file
+# Load audio
-video_file = "stimulus.mp4"
+# preBuffer – integer to control streaming/buffering -1 means store all
+audio = sound.Sound(audio_file,preBuffer=-1)
-Load video
+# Play audio
-movie = visual.MovieStim3(win, video_file, size=(1920, 1080), flipVert=False, flipHoriz=False, loop=False)
+audio.play()
-Play video
+# Wait for audio to finish playing
-while movie.status != visual.FINISHED: movie.draw() win.flip()
+core.wait(audio.getDuration())
-Close window
+# Close audio
-win.close() core.quit() </syntaxhighlight>
+audio.stop()
+audio.close()
+</syntaxhighlight>