My MA research project revolved around a simple question: can we turn down the level of music playback slowly enough that the listener will not notice? The motivation was to inform the design of a system that would automatically turn down potentially dangerous playback levels to a safer range without diminishing the listener’s experience. As a first step toward this goal, the experiment presented below was run to begin to understand which factors matter for achieving this transparent manipulation.
During the experiment, participants were given a playlist of short song excerpts with a total length of about five minutes. During that time the playback level was gradually reduced at one of five rates: 0 dB per minute (i.e. no change), or -1, -2, -3, or -4 dB per minute. This means that some listeners heard music that remained at a constant level, while others heard music that was 20 dB lower by the end of the five minutes (for the fastest reduction rate tested).
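The arithmetic behind these rates can be sketched in a few lines of Python. This is only an illustration, not the code used in the experiment: it assumes the ramp is linear in dB over time and that the playlist lasts exactly five minutes, and the function names `gain_db` and `linear_gain` are hypothetical.

```python
# Illustrative sketch only: assumes a ramp that is linear in dB
# and a playlist of exactly 5 minutes (the text says "about five").

RATES_DB_PER_MIN = [0, -1, -2, -3, -4]  # rates listed in the text
DURATION_S = 5 * 60                     # assumed exact duration


def gain_db(t_seconds, rate_db_per_min):
    """Cumulative level change in dB after t_seconds of the ramp."""
    return rate_db_per_min * t_seconds / 60.0


def linear_gain(db):
    """Amplitude scale factor corresponding to a level change in dB."""
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    for rate in RATES_DB_PER_MIN:
        total = gain_db(DURATION_S, rate)
        print(f"{rate:+d} dB/min: {total:+.0f} dB after 5 min, "
              f"amplitude scaled by {linear_gain(total):.3f}")
```

Under these assumptions, the fastest rate ends 20 dB down, which corresponds to scaling the signal amplitude to one tenth of its starting value.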
The participants were further divided into two groups: some were asked to track loudness throughout their time listening to the music, while the others were given a distracting visual task to complete. With five reduction rates and two attention groups, the 40 participants were divided across 10 conditions. At the end, each participant was asked whether they had heard a reduction in loudness, and the answers are summarized in the breakdown given below. A reported reduction was coded as 1, no change as 0, and a reported increase as -1.
The data shown seem to support two things: that there could exist a rate of level reduction that is imperceptible to listeners, and that listener attention greatly influences this rate. While this experiment narrows down the threshold range for each group, clear and specific boundaries are still hard to pin down within the limits of the data collected so far.
A quick explanation of why I avoid claiming any definitive thresholds follows. The statistical strength of the data is limited, not only because more data is needed for each condition tested, but also because more conditions are needed to fully map out the ranges that define the threshold rate of change for each group (steeper rates of reduction for the distracted group, and slower rates for the non-distracted group). As an additional important technical note, the coding of the answers is problematic: it imposes a numeric scale on the categorical answers the participants gave, which is incorrect, and this in turn makes the means presented here problematic as well. I’m presenting the graphs with this error to remain consistent with the paper and poster presentation (see below), but this will certainly be addressed in any future iteration of this research.
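One way the coding issue could be avoided is to keep the answers as categories and report counts or proportions per condition instead of a mean of numeric codes. The sketch below is a hedged illustration of that idea, not the analysis used in the paper: the category labels, the function names `tally` and `proportion_reporting_reduction`, and the `placeholder` list are all hypothetical, and the placeholder answers are made up purely to show the call and are not data from the experiment.

```python
from collections import Counter

# Hypothetical labels for the three possible answers.
CATEGORIES = ("reduction", "no change", "increase")


def tally(responses):
    """Count how many answers fall in each category (no numeric coding)."""
    counts = Counter(responses)
    return {category: counts.get(category, 0) for category in CATEGORIES}


def proportion_reporting_reduction(responses):
    """Share of participants in one condition who reported a reduction."""
    return sum(r == "reduction" for r in responses) / len(responses)


# Made-up placeholder answers for a single condition, NOT experiment data.
placeholder = ["reduction", "no change", "no change", "increase"]

print(tally(placeholder))
print(proportion_reporting_reduction(placeholder))
```

Counts and proportions like these make no claim that "increase" sits a fixed distance below "no change", which is the assumption the -1/0/1 coding builds in.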
This experiment was an initial attempt at approaching the larger problem and an important learning process for me; I do not consider it a strong conclusion for the project as such. A better design would have used fewer conditions to make better use of the limited participant pool, and would have planned the data collection with a more statistically sound approach.
The experiment described above was presented at the 145th AES Convention in New York in October of 2018. For a more detailed account of the experimental design, the article (author’s version) submitted to the conference proceedings, before completion of data collection, can be found here:
Special thanks and gratitude are extended to Jonathan Berger, Madeline Huberth, Chris Chafe, Takako Fujioka, and Stephen McAdams for their advice and support in this project, and to Nick Gang for the audio data set used in this experiment. Facilities were used at CCRMA, Stanford University, and at CIRMMT, McGill University, and the experiment made use of facilities funded by NSERC.
Beyond this specific experiment and the issues discussed here, the mechanism that motivated it does seem possible in light of the data collected. As a bit of affirmation of my idea, I learned recently that BOSE has introduced a “Smart Volume Control” feature into a new product of theirs that does almost exactly what is described above: if the music stays above a certain level for a long enough period of time, the volume is automatically turned down to a safer range in a way they deemed imperceptible. When I had the chance to speak with a lead member of the project, Lee Zamir, he mentioned that they wait a bit of time before engaging the reduction to make it less noticeable, which seems to corroborate the strong effect of listener attention on the perceptibility of the change suggested by my experiment. Hopefully more companies in the consumer audio electronics market will take their cue from the new feature BOSE implemented and begin to treat their customers’ long-term hearing health as a strategically important consideration.