Beat-Relevant Signals in Auditory Cortical Responses to Musical Excerpts
doi: https://doi.org/10.1101/481473
Abstract
Musical beat perception is widely regarded as a high-level ability involving widespread coordination across brain areas, but how low-level auditory processing must necessarily shape these dynamics, and therefore perception, remains unexplored. Previous cross-species work suggested that beat perception in simple rhythmic noise bursts is shaped by neural transients in the ascending sensory pathway. Here, we found that low-level processes can substantially explain the emergence of beat in real music. Firing rates in the rat auditory cortex in response to twenty musical excerpts were on average higher on the beat than off the beat tapped by human listeners. This "neural emphasis" distinguished the perceived beat from alternative interpretations, was predictive of the degree of consensus across listeners, and was accounted for by a spectrotemporal receptive field model. These findings indicate that low-level auditory processing may have a stronger influence on the location and clarity of the beat in music than previously thought.
Introduction
The perception of a steady pulse or beat in music is a curious phenomenon that arises from the interaction between rhythmic sounds and the way our brain processes them. There are two things that make musical beat perception particularly intriguing. Firstly, no mammalian species apart from humans consistently shows spontaneous motor entrainment to the beat in music (e.g. tapping a foot, nodding the head, moving the body)1–4. Secondly, despite beat being a subjective percept rather than an acoustic feature of music, individual listeners tend to overwhelmingly agree on where the beat is. Some of this consistency might be due to certain "top-down" constraints such as cultural and cognitive priors5–7. However, apart from theory8,9, relatively little is known about the neurophysiological dynamics that cause the feeling of musical beat to emerge in the first place.
A fundamental piece of information currently lacking is which aspects of the neural representation of music might be important for the induction of beat. Previous cross-species work revealed that firing rates as early as the auditory midbrain are significantly higher on the beat than off the beat in simple rhythms constructed from identical broadband noise bursts10. If large firing rate transients resulting from low-level auditory processing are indeed necessary for the induction of beat, then this insight could shed light on the dynamics of the entrainment of cortical oscillations to beat11–16, the role played by the motor system8,17–26, and why different species differ so much in their beat perception and synchronization capacity27.
Importantly, if a consequence of auditory processing is to create points of neural emphasis that predispose beats to being felt there, then we should observe this not just for simple rhythmic "laboratory sounds," but also for real music. Twenty musical excerpts28, which were diverse in tempo and musical genre, were played to three anesthetized rats while recording extracellularly from auditory cortex. In line with previous findings, population firing rates were higher on the beat than off the beat, and large on-beat to off-beat firing rate ratios were a distinguishing feature of the consensus beat interpretation across human listeners. Comparison with the output of an auditory nerve model revealed that small effects may already be present at the auditory periphery but are amplified substantially in cortical responses. Musical excerpts that evoked a larger cortical on-beat emphasis also showed a stronger consensus in tapping behavior across listeners. Finally, these results could be accounted for by the spectrotemporal receptive field properties of recorded units. These findings add to growing evidence that beat perception is not entirely culturally determined, but is also heavily constrained by low-level auditory processing common to mammals.
Results
Neural activity from a total of 98 single and multi-units was analyzed in response to 12 repeats of the first 10 seconds of 20 musical excerpts taken from the MIREX 2006 dataset, which included beat annotations made by 40 human listeners28. In all songs, listeners reported a steady beat well within the first 10 s. The most common tapping pattern for each excerpt was taken to be that excerpt's "consensus" beat interpretation (see Methods), and consensus tapping rates ranged from 0.7 Hz to 3.7 Hz (42 to 222 beats per minute, corresponding to beat periods of 1.42 down to 0.27 s). The analyses that follow investigate correspondences between firing rates in the rat auditory cortex around the consensus beat as reported by human listeners.
Auditory cortical firing rates are higher on the beat than off the beat
For each song, the 100 ms time window following each consensus tap was defined as on-beat, and all time excluding these on-beat windows was defined as off-beat (the results are not sensitively dependent on this precise definition, see Methods). Fig 1A shows the average on-beat population firing rate plotted against the off-beat population firing rate for each of the 20 tested musical excerpts. On-beat firing rates were significantly larger than off-beat firing rates (p<10−4, Wilcoxon paired signed-rank test, N=20 songs), an observation that is consistent with previous work examining gerbil midbrain responses to simple rhythmic patterns10. The beat-triggered average population firing rate in the 200 ms window around consensus beats (averaged across all beats in all excerpts) provides a more detailed picture of population neural activity around the beat (Fig 1B). The distribution of on:off-beat ratios (OORs; average on-beat firing rate divided by average off-beat firing rate) for each recorded unit (N=98) is shown in Fig 1C. An OOR > 1 indicates that firing rates were higher on the beat than off the beat. Most units show an OOR > 1, and the bimodal distribution suggests that there may be distinct sub-populations in the recorded data, one with OORs centered around 1 and the other with OORs around 1.5.
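As a concrete illustration of this measure, the following Matlab sketch computes the OOR for a single unit on a single excerpt. Variable names (spikeTimes, tapTimes, songDur) are illustrative and are not taken from the authors' analysis code:

```matlab
% Minimal sketch of the on:off-beat ratio (OOR) for one unit and one excerpt.
winDur  = 0.100;                       % 100 ms post-tap on-beat window (s)
onMask  = false(size(spikeTimes));     % which spikes fall in on-beat windows
for t = tapTimes(:)'                   % consensus tap times (s)
    onMask = onMask | (spikeTimes >= t & spikeTimes < t + winDur);
end
onDur   = numel(tapTimes) * winDur;    % total on-beat time
offDur  = songDur - onDur;             % everything else is off-beat
onRate  = sum(onMask)  / onDur;        % on-beat firing rate (spikes/s)
offRate = sum(~onMask) / offDur;       % off-beat firing rate (spikes/s)
OOR     = onRate / offRate;            % OOR > 1: higher firing on the beat
```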
Fig 1. Consensus beat-triggered neural activity and on:off-beat ratios (OORs) in the auditory cortex and auditory nerve.
(A) Mean on-beat versus off-beat population firing rate in auditory cortical neurons. Each dot is one musical excerpt. On-beat firing rates are significantly higher than off-beat firing rates (p<10−4, Wilcoxon paired signed-rank test, N=20 songs). (B) Population "beat-triggered" average firing rate in the auditory cortex in a 200 ms window around the consensus beat times ± standard deviation across the 20 musical excerpts. (C) Histogram of on:off-beat firing rate ratios (OORs) for each recorded unit (N = 98), where "on-beat" is the average firing rate during the 100 ms post-tap window, and "off-beat" is the average firing rate over the entire song excluding on-beat windows. (D) Same as A, but for population activity based on an auditory nerve model with 50 log-spaced frequency channels between 150 Hz and 24 kHz. Predicted firing rates at the auditory nerve were significantly higher on the beat than off the beat (p<0.005, Wilcoxon paired signed-rank test, N=20 songs). (E) Same as B, but for population activity based on the auditory nerve model. (F) Same as C, but for auditory nerve model fibers (N=50).
For comparison, an auditory nerve model29 was used to predict firing rates at the auditory nerve for 50 logarithmically spaced frequency channels between 150 Hz and 24 kHz. Fig 1D shows predictions of on-beat versus off-beat population activity at the auditory nerve. Notably, the auditory nerve model would also predict higher average population firing rates on the beat than off the beat (p<0.005, Wilcoxon paired signed-rank test, N=20 songs). Fig 1E–1F show beat-triggered averages and OORs for auditory nerve model fibers. OORs based on the auditory nerve model, though significantly larger than one, are much smaller than cortical OORs (p<10−4, Wilcoxon paired signed-rank test, N = 20 songs).
A large neural emphasis is a distinguishing feature of the consensus beat
While we have shown that firing rates are higher on the beat than off the beat, this on its own does not imply that large OORs are necessarily relevant to beat perception. From a purely signal processing perspective, a musical excerpt could theoretically be perceived as having any combination of tempo and time signature, and if most of these possible alternative beat interpretations were associated with more or less equally large OORs, then large OORs would be of little value as physiological markers of musical beat. Therefore, if a large OOR is relevant for the induction of beat, we hypothesized that it should be large for the consensus beat relative to plausible alternatives.
To test this, we computed hypothetical OORs for the full range of plausible beat period and phase combinations. For each song, possible beat periods (representing the different rates at which a listener might tap) were allowed to range from 0.2 s to 2 s (5 Hz down to 0.5 Hz) sampled in 20 ms steps. Likewise, for each beat period, the phase offset was allowed to range from 0 up to the full beat period sampled in 20 ms steps to capture the fact that two listeners tapping at the same rate may nevertheless exhibit different interpretations of the beat if their taps, rather than being synchronous, have a constant offset between them. The OOR was then computed for each of these beat period and beat offset combinations, resulting in 4,995 possible OOR values for each musical excerpt. The heatmaps in Fig 2A and 2B show the computed set of plausible OOR values calculated from cortical and auditory nerve model firing rates, respectively, for an example musical excerpt, with possible beat periods on the y-axis and possible starting phase offsets on the x-axis (see Supplementary Figs S1-S2 for heatmaps of all musical excerpts). The histograms in Fig 2C and 2D pool together hypothetical (in gray) and consensus (in red) OOR values from all musical excerpts (histograms for individual excerpts in Supplementary Figs S3-S4).
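A minimal Matlab sketch of this grid search is given below, assuming the population firing rate is stored as a vector popRate sampled at 200 Hz over a 10 s excerpt; the exact handling of the grid endpoints may differ slightly from that used in the study:

```matlab
% Sketch of the hypothetical-OOR grid search for one excerpt.
fs      = 200;                          % PSTH sampling rate (5 ms bins)
winDur  = 0.100;                        % on-beat window, as above
oorGrid = [];                           % rows: [period, phase, OOR]
for period = 0.2:0.02:2                 % candidate beat periods (s)
    for phase = 0:0.02:period-0.02      % candidate phase offsets (s)
        taps = phase:period:10-winDur;  % hypothetical tap times
        on = false(size(popRate));
        for t = taps                    % mark on-beat time bins
            i0 = floor(t*fs) + 1;
            on(i0 : min(end, i0+round(winDur*fs)-1)) = true;
        end
        oor = mean(popRate(on)) / mean(popRate(~on));
        oorGrid(end+1,:) = [period, phase, oor]; %#ok<AGROW>
    end
end
```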
Fig 2. How does the consensus beat compare with other possible beat structures?
(A) Heatmap depicting cortical on:off-beat ratios for plausible beat period (y-axis) and beat phase offset (x-axis) combinations between 200 ms and 2 s (or tap rates of 5 Hz down to 0.5 Hz) for one example musical excerpt. Color indicates the OOR value. (B) Same as A, but for population activity based on the auditory nerve model. (C) Histogram pooled across musical excerpts of all OOR values (gray), and consensus OOR values (red) in the auditory cortex. (D) Same as C, but based on OOR values from the auditory nerve model.
If the OOR is a distinguishing feature of the perceived beat, we would expect it to rank above the 50th percentile of the underlying distribution of hypothetically plausible OORs for a given musical excerpt. As hypothesized, the consensus OORs ranked significantly higher than the 50th percentile, both in the auditory cortex (p<10−4, Wilcoxon signed-rank test, N = 20 songs), and in the auditory nerve model (p<0.005). However, the percentiles were significantly larger in the auditory cortex than in the auditory nerve model (p<0.005, Wilcoxon paired signed-rank test, N = 20 songs). Notably, 14 out of the 20 musical excerpts tested had consensus OORs above the 95th percentile in the auditory cortex, in contrast to only 7 out of 20 based on the auditory nerve model. Additionally, fewer hypothetical beat interpretations resulted in large OORs in the auditory cortex, as evidenced by the higher skewness, or longer right tails, of the OOR distributions in the auditory cortex compared to those based on the auditory nerve model (p<10−4, Wilcoxon paired signed-rank test, N = 20 songs). Together, these results suggest that a large OOR is a feature that distinguishes the consensus beat from most other possible beat structures, and that two important consequences of auditory processing might be an amplification of small differences in OOR already present at the auditory periphery, and a further restriction of the candidate beat interpretations that would result in large OORs.
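The percentile ranking underlying this analysis can be sketched as follows, reusing oorGrid from the sketch above and assuming a precomputed consensusOOR value for the excerpt:

```matlab
% Percentile rank of the consensus OOR among all hypothetical OORs.
allOOR = oorGrid(:,3);
pct    = 100 * mean(allOOR < consensusOOR);  % percentile rank of consensus
isTop  = pct > 95;                           % above the 95th percentile?
```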
The stronger the on-beat neural emphasis, the stronger the tapping consensus
It is clear from Fig 2C (and Supplementary Fig S3) that consensus OORs are consistently among the largest possible OORs across our set of musical excerpts, but they are not always the largest. However, it is not uncommon for the beat in a given piece of music to be perceived in different ways. More often than not, listeners will exhibit a variety of tapping patterns, for example with some tapping twice as fast or half as fast as others, or 180 degrees out of phase with others. Additionally, if the beat is not very salient, there will be uncertainty about when exactly a beat occurs and therefore an increased variance in observed inter-tap-intervals. In such cases, and indeed in the dataset we use, listeners display a range of perceived beat interpretations, and what we have termed the consensus beat is simply the beat interpretation that happens to be favored by a (sometimes narrow) majority of listeners. This variability is illustrated in Fig 3, where for some excerpts tapping behavior was consistent across a large majority of listeners (e.g. Fig 3A), and for others tapping behavior was more variable, indicating a less salient or more ambiguous beat percept (e.g. Fig 3B–3C; see Supplementary Figs S5 and S6 for tapping behavior for all excerpts).
Fig 3. A glimpse into the variability across human listeners tapping to the beat in music.
(A) Top: Raster plot of tap times for the 40 human annotators across the 10 s excerpt of an example song. Each row is one subject, and location along the x-axis represents when the subject tapped during the 10 s musical excerpt. Consensus beat times are marked by gray vertical lines (see Methods). Note that most subjects' taps line up in time with each other and with the consensus beat for this example excerpt. Bottom: Tap density estimates based on tap times pooled across subjects, binned with 2 ms bins, and smoothed with a Gaussian kernel with a standard deviation of 5% of the consensus beat period (blue). Shown in red is a smoothed tap density estimate of the "ideal" tap histogram (with realistic motor error) that would have been obtained if all subjects had tapped on every consensus beat (see Methods). The correlation between real and idealized density is high for this excerpt (r=0.88), indicating a strong tapping consensus. (B) Same as A, but for a musical excerpt with multiple minority beat interpretations and therefore a lower correlation coefficient (r=0.78). (C) Same as B, but where the tapping consensus is even weaker (r=0.59). See Supplementary Figs S5 and S6 for all musical excerpts.
Nevertheless, if we hypothesize that a large OOR predisposes a listener to hear a particular beat interpretation, then we would predict that the excerpts that evoke the largest OORs in cortical responses should also be the ones that evoke the clearest, most unambiguous beat percept across listeners. Can the variability in tapping behavior be explained by the size of OORs in the auditory cortex?
To answer this question, we quantified the strength of the tapping consensus for each song by computing the correlation coefficient between the smoothed histogram of observed tap times and the smoothed histogram of the "ideal" case in which all 40 listeners would have tapped on each consensus beat within a realistic degree of sensory or motor error (see Methods). Examples of observed (blue) and idealized (red) tap density estimates are shown in the lower panels of Fig 3.
Consistent with our hypothesis, the size of the consensus OOR evoked in the auditory cortex by a musical excerpt correlated significantly with the strength of the tapping consensus across listeners (Fig 4A; p<0.001, Pearson correlation, N = 20 songs). Neither OOR (p=0.48) nor consensus strength (p=0.44) varied with the consensus tempo of musical excerpts (Pearson correlation, N = 20 songs). Fig 4B and 4C show how OOR and consensus strength, respectively, develop over the course of the 10 s duration of the musical excerpts. Data were split into five 2-s chunks, and OORs and correlation coefficients were calculated based on the data in each chunk. Tapping consensus strength, which is low initially, is nearly at ceiling from about 4 s into the excerpts, indicating that listeners only needed a few seconds to find the beat. OORs, on the other hand, did not change systematically over time, suggesting that the correspondences observed in this study between neural activity and behavior are unlikely to be due to cortical entrainment or buildup in neural responses.
Fig 4. The stronger the on-beat neural emphasis, the stronger the tapping consensus.
(A) Each dot is one musical excerpt. There is a strong correlation between auditory cortical OOR (x-axis) and the tapping consensus across listeners, quantified as described in Fig 3 (y-axis; p<0.001, Pearson correlation, N = 20 songs). (B) Tapping consensus, calculated for each sequential 2 s segment of musical excerpts. Colored lines are individual songs. In black is the mean across songs for each time chunk ± standard deviation. (C) Same as panel B but for OOR values.
Spectrotemporal receptive field based models explain nearly 90% of the variance in OOR
The beat-related processing observed in the rat auditory cortex may be due to beat-specific processes, or, as we hypothesized might be more likely, due to the spectrotemporal tuning properties of recorded units. If this were the case, neural responses predicted using a standard linear-nonlinear (LN) model fitted to each unit should largely reproduce observed OORs. To test this, we first estimated each unit's spectrotemporal receptive field (STRF), or the linear model that describes the frequency and timing properties of incoming sounds that would either excite or inhibit a neuron. Next, we estimated the unit's static sigmoid output nonlinearity to arrive at a fitted LN model for each unit (see Methods). The LN model was fitted 20 times for each unit, each time using that unit's responses to 19 of the musical excerpts while setting aside one excerpt as a test song. This ensured that predicted neural responses for a test song were true predictions since the model was not trained on the test excerpt. In this manner, firing rate predictions were generated for each unit and each musical excerpt, and these were then analyzed to arrive at predicted OOR values.
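A sketch of the prediction stage for one held-out song is given below. The cochleagram cell array, the fitted strf, and the sigmoid parameters (a, b, c, d) are assumed to come from the fitting procedure described in the Methods; none of these names are taken from the authors' code:

```matlab
% Sketch of LN-model prediction for one unit on one held-out song.
nLags = 20;                                  % 100 ms history at 5 ms bins
C  = cochlea{testSong};                      % [nFreq x nTime] cochleagram
[nF, nT] = size(C);
lin = zeros(1, nT);
for t = nLags:nT                             % linear stage: STRF dot product
    snip   = C(:, t-nLags+1:t);              % 100 ms cochleagram snippet
    lin(t) = sum(sum(strf .* snip));
end
pred = a + b ./ (1 + exp(-(lin - c) / d));   % static sigmoid nonlinearity
```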
An STRF from an example unit is shown in Fig 5A, with frequency on the y-axis and stimulus history on the x-axis. This unit shows a preference for frequencies at and above 16 kHz, and is excited if sounds in that frequency range were heard 25 ms ago but inhibited if they occurred 40 ms ago. A short excerpt from a test song is shown in Fig 5B, where it can be seen that LN model predictions are in good agreement with observed firing rates. Fig 5C shows consensus OOR values for each musical excerpt based either on observed (x-axis) or predicted (y-axis) firing rates. The LN model slightly underestimates OORs (p<0.001, Wilcoxon paired signed-rank test, N = 20 songs), suggesting that there is some nonlinear process that slightly increases OOR beyond processes captured by a standard LN model. However, despite this small difference, the LN model successfully accounts for 89% of the variance in OOR values for the tested musical excerpts (p<10−6, Pearson correlation, N = 20 songs). Predictions made using the linear STRF alone (without the static nonlinearity) accounted for 61% of the variance in OOR (p<0.01, Pearson correlation, N = 20 songs).
Fig 5. Cortical firing rate predictions based on fitted linear-nonlinear (LN) models incorporating spectrotemporal receptive fields (STRFs).
(A) STRF from an example unit, with frequency on the y-axis and time on the x-axis and color representing the coefficients. This unit shows a classic pattern of excitation and inhibition in a relatively narrow frequency range. Convolving this filter with the spectrogram of a sound stimulus, then applying a static nonlinearity, yields the LN model's prediction of this unit's firing rate over time. (B) Measured (blue) and LN model predictions (red) of the population firing rate for a 1 s segment of an example musical excerpt. Gray vertical lines mark consensus tap times in this segment. (C) Observed (x-axis) versus predicted (y-axis) consensus on:off-beat ratios for each song. LN models account for 89% of the variance in OOR.
Discussion
The aim of this study was to explore how firing rate transients in the auditory cortical representation of music might set the stage for the perception of musical beat. Our results, based on 20 musical excerpts that were diverse in tempo and genre, revealed that population firing rates were on average higher on the beat than off the beat, and that large on:off-beat ratios (OORs) were a distinguishing feature of the beat interpretations most commonly tapped by human listeners. While small differences between on-beat and off-beat responses were already present in auditory nerve model responses, these differences were substantially amplified in auditory cortical responses. Furthermore, musical excerpts that evoked larger OORs in the auditory cortex also showed stronger tapping consensus among listeners. Finally, the spectrotemporal receptive field (STRF) properties of cortical units were able to account for the magnitude of the OOR each musical excerpt would induce. Together, these findings suggest that large OORs in the auditory cortex, which arise due to the spectrotemporal tuning properties of neurons, may be key to establishing the location and clarity of the perceived beat.
It is worth noting the extent to which the physiology corresponded to tapping behavior and the extent to which standard LN STRF models could capture the physiology for real musical excerpts. These observations strongly suggest that the related low-level mechanisms of neuronal adaptation10, amplitude modulation tuning30, and STRFs play a determinative role in musical beat perception. This is not inconsistent with the theory that the induction of the beat percept is the result of an interaction between "bottom-up" sensory processes and "top-down" cognitive ones31. Our data suggest that beat perception may actually begin weakly at the ear, with neural activity showing stronger correspondences to behavior as information ascends through the brainstem and primary cortical structures of the ascending auditory pathway32,33. Since these parts of the ascending auditory system are often highly conserved across mammalian species34–37, cross-species investigations may be a promising way to understand the neural signals and dynamics that underlie beat induction, which to date remain mysterious.
Though our results indicate that beat perception is strongly influenced by basic physiological mechanisms and therefore only partly culturally determined, they do not imply that "bottom-up" processes could possibly explain everything. For example, some well-studied constraints on beat perception include the tendency to perceive a beat within a frequency range of roughly 0.5–4 Hz38 with a special preference for 2 Hz39, and an overall preference for binary (e.g. 2, 4) meters over ternary (e.g. 3, 6) or other complex meters38,40. These constraints are likely driven by top-down influences or may result from auditory-motor interactions8,17–25 and are unlikely to be explained by bottom-up sensory processing alone. Furthermore, the perceived beat and its neural signatures can be modulated at will by top-down attention or mental imagery of beat structure12,41–43. Bringing these ideas together, we suggest that the perception of beat relies on the application of learned and implicit rhythmic priors6,7 onto an ascending sensory representation10,30 with a bias towards configurations that maximize the difference between neural activity on and off the beat.
That we see as much correspondence as we do between the representation in auditory cortex and beat perception could be an indication that neural activity in the auditory cortex is a key interface between the sensory and motor and/or cognitive processes involved in beat perception. Probing the cortico-basal ganglia-thalamo-cortical loop44 may be a promising avenue for future investigations. Projections from auditory cortical fields to the basal ganglia have been well-characterized45, and the basal ganglia in humans have been repeatedly implicated in beat perception22,43,46,47 as well as other auditory cognitive abilities48. We speculate that large firing rate transients in the auditory cortex, observed in this study to co-occur with the perceived beat, could set into motion the dynamics of this loop and thereby enable the possible entrainment of cortical oscillations to the beat21,42,49–51. We advise caution, however, as there is currently some debate around what constitutes neural entrainment to auditory rhythms52–54, and whether frequency-domain representations of rhythms and brain signals necessarily reflect beat perception55.
The extent of the correspondence observed in this study between auditory cortical activity in rats and human beat perception also invites the intriguing question of whether rodents too can perceive musical beat. Preliminary evidence suggests that rats can be trained to discriminate isochronous rhythms from non-isochronous ones56. Mice also appear capable of performing a synchronization-continuation task, and in that study, primary auditory cortex was implicated as being necessary for the generation of anticipatory motor actions57. These studies at minimum suggest that rodents have the capacity to perceive temporal structure and execute motor actions timed to an external isochronous rhythm. Future behavioral studies are needed to explore the limits of sensorimotor synchronization in rodents.
At the other end of the spectrum are humans, whose ability to synchronize with an external rhythm, whether it is to a metronome or to the beat in music, is spontaneous1, highly anticipatory58, innate59, and often involuntary2,60,61. The gradual audiomotor evolution hypothesis posits that the ability to entrain movements to musical beat relies on strong coupling between the auditory and motor systems, and that the neurophysiology and behavioral capacity to do so evolved gradually20. This hypothesis is supported by evidence that nonhuman primates, like humans, are capable of producing tempo-flexible anticipatory movements in time with a metronome62,63 and can detect rhythmic groupings, but cannot detect or synchronize to a musical beat4. The dissociation between perceiving auditory rhythms and perceiving musical beat may relate to findings that distinct networks underpin "duration-based" and "beat-based" temporal predictions64–66. It is important for future studies in the area of beat perception to be clear about precisely what is being perceived, since there is demonstrable nonequivalence between the detection of a pulse in isochronous rhythms, a pulse in real music, and beat in the context of the different levels of nested hierarchical structure present in music, the latter of which has arguably not yet been demonstrated in any nonhuman species67.
This leads to the question of why beat perception exists in the first place. Some clues might be found in parallels that beat perception has with other abilities, particularly with the human capacity for language68–70. Another possibility is that beat may provide a way to quickly assess locomotion speed from the sound of a complex gait. Though this speculation has not yet been tested directly, gait studies have shown that humans are able to assess a number of attributes of a walker based only on their walking sounds, including gender, posture, and emotional state71,72.
However, at the heart of these complex abilities are neural circuits that are very old and also underlie more general auditory cognitive abilities73 such as perception of time74 and prediction of future sensory inputs75. Therefore, a unified perspective that would bring all of this together is that the information processing performed by the auditory system up to primary auditory cortex is largely consistent across most mammals, but the complexity of the operations the organism ecologically needs to perform with this information may be the determinant of what is "top-down." Our data suggest that strong firing rate transients in the neural representation of real music may shape where the beat is felt, and while an on-beat neural emphasis is certainly not the whole story, it is a lead worth exploring further. Ultimately, this work underscores the importance of low-level auditory processing in creating a representation of sound where certain features are emphasized based on temporal context, a representation on which other high-level processes rely to give rise to complex perception.
Methods
Stimuli
The 20 songs tested were the training dataset for the MIREX 2006 beat tracking algorithm competition76. Each song had beat annotations collected from 40 human listeners28. Only the first 10 s of songs and beat annotations were used in this study.
Surgical Protocol
All procedures were approved and licensed by the UK Home Office in accordance with governing legislation (ASPA 1986). Three female Lister Hooded rats weighing approximately 250 g were anesthetized with an intraperitoneal injection of 0.05 ml Domitor and 0.1 ml ketamine. To maintain anesthesia, a saline solution containing 16 µg/kg/h Domitor, 4 mg/kg/h ketamine, and 0.5 mg/kg/h Torbugesic was infused continuously during recording at a rate of 1 ml/h. A craniotomy was performed 4.7 mm caudal to bregma and extending 3.5 mm lateral from the midline on the right hand side.
Recordings were made using a 64-channel silicon probe (Neuronexus Technologies, Ann Arbor, MI, USA) with 175 µm² recording sites arranged in a square grid pattern at 0.2 mm intervals along eight shanks with eight channels per shank. The probe was inserted into the auditory cortex in a medio-lateral orientation wherever possible.
The 20 songs were played in randomized order for a total of 12 repeats, with three seconds of silence separating each song from the next. Stimuli were presented binaurally through headphones at 80 dB SPL. Sounds were presented at a sampling rate of 48,828.125 Hz, and data were acquired at a sampling rate of 24,414.0625 Hz using a TDT System 3 recording setup (Tucker-Davis Technologies).
Data Analysis
Tapping Analysis
To calculate consensus tap times, the histogram of tap times, pooled across the 40 subjects and then binned using 2 ms bins, was smoothed using a Gaussian kernel with a width (standard deviation) of 40 ms. This width was chosen because visual inspection of tap histograms showed the standard deviation around taps to be approximately 40 ms, so a Gaussian kernel with that width would approximate a "matched filter." The precise width of the smoothing kernel was not critical to our results as long as it roughly matched the spread in the data. A peak-finder (findpeaks.m, built-in Matlab function) was then used to identify peaks that were larger than 40% of the maximum value in the smoothed histogram. The consensus inter-tap-interval (ITI) for a song was taken to be the mean interval between successive peaks, after the exclusion of intervals larger than 1.5 times the median inter-peak-interval (which would happen if the peak-finder missed a peak). The consensus phase was determined by finding the offset that optimally aligned a temporal grid with consensus ITI spacing with the peaks found by the peak-finder. Consensus tap times can be described by a consensus ITI (beat period) and consensus offset (beat phase) combination for each song.
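The following Matlab sketch illustrates this procedure, assuming tapTimesAll is a vector of tap times (in seconds) pooled across the 40 annotators and that findpeaks is available:

```matlab
% Sketch of the consensus tap-time (ITI) estimation for one excerpt.
binW   = 0.002;                                  % 2 ms bins
edges  = 0:binW:10;
h      = histcounts(tapTimesAll, edges);         % pooled tap histogram
sigma  = 0.040 / binW;                           % 40 ms kernel SD, in bins
x      = -4*sigma:4*sigma;
g      = exp(-x.^2 / (2*sigma^2));               % Gaussian smoothing kernel
g      = g / sum(g);
hs     = conv(h, g, 'same');                     % smoothed tap histogram
[~, pkLoc] = findpeaks(hs, 'MinPeakHeight', 0.4*max(hs));
ipi    = diff(pkLoc) * binW;                     % inter-peak intervals (s)
ipi    = ipi(ipi <= 1.5*median(ipi));            % drop skipped-peak gaps
consensusITI = mean(ipi);                        % consensus beat period
```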
On-beat neural activity was defined as the average population firing rate in the 100 ms following consensus tap times, and off-beat neural activity was the average population firing rate during all time excluding these on-beat windows. The justification for this definition is that (i) the true perceived beat location is almost certainly shortly after a listener taps, given the well documented tendency of listeners to anticipate the beat with their movements by several tens of milliseconds (negative mean asynchrony)61, (ii) defining off-beat activity as all neural activity that is not on the beat is consistent with previous work10, and (iii) an interval of 100 ms is less than half a beat cycle for the fastest beat period observed in these data of 273 ms. The precise choice of time window is not critical, and this was confirmed by running all analyses using on-beat windows that ranged between 40 ms and 120 ms in 10 ms increments. The results were entirely consistent with those presented here for a time window of 100 ms, and if anything, slightly stronger when shorter time windows were used.
To compute the strength of the consensus, an "ideal tap histogram" was constructed by assuming all 40 listeners tapped precisely at each consensus beat time as determined by the excerpt's consensus ITI and phase. A realistic degree of motor error was added by convolving this with a Gaussian kernel whose width was 5% of the beat period. The same 5% Gaussian kernel was then used for kernel density estimation on the two signals: the raw pooled histogram of (measured) tap times that already contained motor error, and the idealized tap histogram with motor error added. The choice of temporal filter width was guided by the magnitude of errors reported in studies of human sensorimotor synchronization77–79, but other kernel widths close to 5% also produce consistent results. The correlation coefficient between real and idealized tap density estimates for a given musical excerpt was taken as a measure of the strength of the tapping consensus, where a large value would indicate a high degree of similarity between real and "ideal" tapping behavior. Estimation of the real and idealized tap densities is also possible using a constant width (e.g. 40 ms) Gaussian kernel rather than a proportional one. However, while doing so would lead to the same main result shown in Fig 4, this measure of tapping consensus strength would have the undesirable effect of also correlating with song tempo since, as mentioned above, it is well-established that the magnitude of sensorimotor synchronization errors scales with interval duration.
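A sketch of this consensus measure, reusing h (the pooled 2 ms tap histogram), binW, and consensusITI from the sketch above; consensusTaps is assumed to list the consensus beat times within the 10 s excerpt:

```matlab
% Sketch of the tapping-consensus measure (Fig 3 correlation).
sigma  = 0.05 * consensusITI / binW;             % kernel SD: 5% of period
x      = -ceil(4*sigma):ceil(4*sigma);
g      = exp(-x.^2 / (2*sigma^2));  g = g / sum(g);
ideal  = zeros(size(h));                         % "ideal" tap histogram:
idx    = round(consensusTaps / binW) + 1;        % all 40 taps on every beat
ideal(idx) = 40;
realD  = conv(h,     g, 'same');                 % smoothed real tap density
idealD = conv(ideal, g, 'same');                 % smoothed ideal tap density
R      = corrcoef(realD, idealD);
consensusStrength = R(1,2);                      % tapping consensus strength
```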
Electrophysiology Data Preprocessing
Offline spike sorting and clustering was done on the raw data using an automated expectation-maximization algorithm (SpikeDetekt/KlustaKwik)80, and clusters were manually sorted using KlustaViewa (Cortical Processing Laboratory, University College London). Firing rates over time for multi-units were calculated by binning spike times into 5 ms bins, which resulted in peri-stimulus time histograms (PSTHs) at an effective sampling rate of 200 Hz.
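For illustration, a PSTH of this kind can be computed as in the sketch below, where spikes is assumed to be a cell array of spike-time vectors, one per repeat:

```matlab
% Sketch: PSTH at 200 Hz for one unit, averaged over the 12 repeats.
binW  = 0.005;                         % 5 ms bins -> 200 Hz
edges = 0:binW:10;
psth  = zeros(1, numel(edges)-1);
for r = 1:numel(spikes)
    psth = psth + histcounts(spikes{r}, edges);
end
psth = psth / (numel(spikes) * binW);  % mean firing rate in spikes/s
```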
To determine whether spikes were reliably stimulus-driven, a noise power to signal power cutoff of 40 was chosen81. Song 1 was arbitrarily chosen to be the stimulus for which the repeatability of responses was measured. Units that failed to show a noise power to signal power ratio less than 40 based on the 12 repeats were excluded from further analysis, leaving a total of 98 multi-units. All subsequent analyses were performed using custom-written Matlab code.
Fitting the LN Model
The relevant scripts used at all stages of this process are available on GitHub82. First, music stimuli were transformed into a simple approximation of the activity pattern received by the auditory pathway by calculating the log-scaled spectrogram ("cochleagram")82–84. For each sound, the power spectrogram was taken using 10 ms Hanning windows, overlapping by 5 ms. The power across neighboring Fourier frequency components was then aggregated using overlapping triangular windows comprising 27 frequency channels with center frequencies ranging from 50 Hz to 20,319 Hz (1/3 octave spacing). Next, the log was taken of the power in each time-frequency bin, and finally any values below a low threshold were set to that threshold. These calculations were performed using code adapted from melbank.m (http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html). The STRF model was trained to predict the firing rate at time t from a snippet of the cochleagram extending 100 ms (20 time bins) back in time from time t. The linear weights describing the firing rate of each neuron were estimated by regressing, with elastic net regularization, each neuron's firing rate at each time point against the 100 ms cochleagram snippet directly preceding it. Regularization strength was set by using a randomly chosen 10% of time bins from the cross-validation set as a validation set, and then by choosing the regularization parameters that led to the fit on the validation set with the lowest mean squared error. A sigmoidal nonlinearity85 was then fitted to map from the linear activation to the predicted PSTH such that it minimized the error between the predicted PSTH and the observed PSTH. LN model predictions of a unit's PSTH to a test song were made by first convolving the cochleagram of the test song with the linear STRF and then applying the nonlinearity. Each unit's LN model was calculated 20 times, each time setting a different song aside as the test set. This was done so that PSTH predictions for any musical excerpt were true predictions since that excerpt was not included in the training set for the model.
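The sketch below illustrates the linear (STRF) estimation step under these definitions: a lagged design matrix is built from the cochleagram and the PSTH is regressed onto it with elastic net regularization (lasso with an Alpha parameter, Statistics and Machine Learning Toolbox). For brevity, the regularization strength is chosen here by in-sample MSE rather than on a held-out 10% of time bins as in the actual pipeline; C and psth are assumed to be the training cochleagram and firing rate:

```matlab
% Sketch of STRF estimation by regularized regression.
nLags = 20;                              % 100 ms history, 5 ms bins
[nF, nT] = size(C);                      % C: [nFreq x nTime] cochleagram
X = zeros(nT - nLags + 1, nF * nLags);   % one row per predicted time bin
for t = nLags:nT
    snip = C(:, t-nLags+1:t);            % 100 ms snippet preceding bin t
    X(t-nLags+1, :) = snip(:)';
end
y = psth(nLags:nT)';                     % firing rate targets
[B, info] = lasso(X, y, 'Alpha', 0.5);   % elastic net over a lambda path
[~, k] = min(info.MSE);                  % simplistic lambda choice (see text)
strf   = reshape(B(:, k), nF, nLags);    % fitted STRF, [nFreq x nLags]
```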
References
- 1.↵
Van Dyck, E. et al. Spontaneous Entrainment of Running Cadence to Music Tempo. Sports Medicine - Open 1, 15 (2015).
- 2.↵
Repp, B. H. & Su, Y.-H. Sensorimotor synchronization: A review of recent research (2006–2012). Psychon Bull Rev 20, 403–452 (2013).
- 3.
Schachner, A., Brady, T. F., Pepperberg, I. M. & Hauser, M. D. Spontaneous Motor Entrainment to Music in Multiple Vocal Mimicking Species. Current Biology 19, 831–836 (2009).
- 4.↵
Honing, H., Merchant, H., Háden, G. P., Prado, L. & Bartolo, R. Rhesus Monkeys (Macaca mulatta) Detect Rhythmic Groups in Music, but Not the Beat. PLoS One 7, e51369 (2012).
- 5.↵
London, J. Cognitive Constraints on Metric Systems: Some Observations and Hypotheses. Music Perception: An Interdisciplinary Journal 19, 529–550 (2002).
- 6.↵
Drake, C. & Ben El Heni, J. Synchronizing with music: intercultural differences. Ann. N. Y. Acad. Sci. 999, 429–437 (2003).
- 7.↵
Jacoby, N. & McDermott, J. H. Integer Ratio Priors on Musical Rhythm Revealed Cross-culturally by Iterated Reproduction. Current Biology 27, 359–370 (2017).
- 8.↵
Large, E. W., Herrera, J. A. & Velasco, M. J. Neural Networks for Beat Perception in Musical Rhythm. Frontiers in Systems Neuroscience 9, 583 (2015).
- 9.↵
Todd, N. P. M. & Lee, C. S. The sensory-motor theory of rhythm and beat induction 20 years on: a new synthesis and future perspectives. Front Hum Neurosci 9, 357 (2015).
- 10.↵
Rajendran, V. G., Harper, N. S., Garcia-Lazaro, J. A., Lesica, N. A. & Schnupp, J. W. H. Midbrain adaptation may set the stage for the perception of musical beat. Proc. Biol. Sci. 284, 20171455 (2017).
- 11.↵
Fujioka, T., Trainor, L. J., Large, E. W. & Ross, B. Beta and Gamma Rhythms in Human Auditory Cortex during Musical Beat Processing. Ann. N. Y. Acad. Sci. 1169, 89–92 (2009).
- 12.↵
Iversen, J. R., Repp, B. H. & Patel, A. D. Top-Down Control of Rhythm Perception Modulates Early Auditory Responses. Ann. N. Y. Acad. Sci. 1169, 58–73 (2009).
- 13.
Fujioka, T., Ross, B. & Trainor, L. J. Beta-Band Oscillations Represent Auditory Beat and Its Metrical Hierarchy in Perception and Imagery. J Neurosci 35, 15187–15198 (2015).
- 14.
Doelling, K. B. & Poeppel, D. Cortical entrainment to music and its modulation by expertise. Proc. Natl. Acad. Sci. U.S.A. 112, E6233–42 (2015).
- 15.
Snyder, J. S. & Large, E. W. Gamma-band activity reflects the metric structure of rhythmic tone sequences. Cognitive Brain Research 24, 117–126 (2005).
- 16.↵
Zanto, T. P., Snyder, J. S. & Large, E. W. Neural correlates of rhythmic expectancy. Advances in Cognitive Psychology 2, 221–231 (2006).
- 17.↵
Schroeder, C. E., Wilson, D. A., Radman, T., Scharfman, H. & Lakatos, P. Dynamics of Active Sensing and perceptual selection. Current Opinion in Neurobiology 20, 172–176 (2010).
- 18.
Patel, A. D. & Iversen, J. R. The evolutionary neuroscience of musical beat perception: the Action Simulation for Auditory Prediction (ASAP) hypothesis. Frontiers in Systems Neuroscience (2014). doi: 10.3389/fnsys.2014.00057
- 19.
Morillon, B., Hackett, T. A., Kajikawa, Y. & Schroeder, C. E. Predictive motor control of sensory dynamics in auditory active sensing. Current Opinion in Neurobiology 31, 230–238 (2015).
- 20.↵
Merchant, H. & Honing, H. Are non-human primates capable of rhythmic entrainment? Evidence for the gradual audiomotor evolution hypothesis. Front Neurosci 7, (2014).
- 21.↵
Tal, I. et al. Neural Entrainment to the Beat: The 'Missing-Pulse' Phenomenon. J. Neurosci. 37, 6331–6341 (2017).
- 22.↵
Grahn, J. A. The Role of the Basal Ganglia in Beat Perception. Ann. N. Y. Acad. Sci. 1169, 35–45 (2009).
- 23.
MacDougall, H. G. & Moore, S. T. Marching to the beat of the same drummer: the spontaneous tempo of human locomotion. J. Appl. Physiol. 99, 1164–1173 (2005).
- 24.
Maes, P.-J., Leman, M., Palmer, C. & Wanderley, M. M. Action-based effects on music perception. Front Psychol 4, 1008 (2014).
- 25.↵
Zatorre, R. J., Chen, J. L. & Penhune, V. B. When the brain plays music: auditory–motor interactions in music perception and production. Nat Rev Neurosci 8, 547–558 (2007).
- 26.↵
Geiser, E., Notter, M. & Gabrieli, J. D. E. A corticostriatal neural system enhances auditory perception through temporal context processing. J. Neurosci. 32, 6177–6182 (2012).
- 27.↵
Patel, A. D. The Evolutionary Biology of Musical Rhythm: Was Darwin Wrong? PLoS Biol 12, e1001821 (2014).
- 28.↵
McKinney, M. F. & Moelants, D. Ambiguity in Tempo Perception: What Draws Listeners to Different Metrical Levels? Music Perception: An Interdisciplinary Journal 24, 155–166 (2006).
- 29.↵
Zilany, M. S. A., Bruce, I. C. & Carney, L. H. Updated parameters and expanded simulation options for a model of the auditory periphery. J. Acoust. Soc. Am. 135, 283–286 (2014).
- 30.↵
Zuk, N. J., Carney, L. H. & Lalor, E. C. Preferred Tempo and Low-Audio-Frequency Bias Emerge From Simulated Sub-cortical Processing of Sounds With a Musical Beat. Front Neurosci 12, 349 (2018).
- 31.↵
Honing, H., Bouwer, F. L. & Háden, G. P. in Neurobiology of Interval Timing 829, 305–323 (Springer New York, 2014).
- 32.↵
Smith, P. H. & Spirou, G. A. in Integrative Functions in the Mammalian Auditory Pathway 15, 6–71 (Springer, New York, NY, 2002).
- 33.↵
Malmierca, M. S. in Encyclopedia of Computational Neuroscience 155–186 (Springer, New York, NY, 2015). doi: 10.1007/978-1-4614-6675-8_286
- 34.↵
Dean, I., Harper, N. S. & McAlpine, D. Neural population coding of sound level adapts to stimulus statistics. Nat Neurosci 8, 1684–1689 (2005).
- 35.
Dean, I., Robinson, B. L., Harper, N. S. & McAlpine, D. Rapid Neural Adaptation to Sound Level Statistics. J Neurosci 28, 6430–6438 (2008).
- 36.
Ingham, N. J. & McAlpine, D. Spike-Frequency Adaptation in the Inferior Colliculus. J Neurophysiol 91, 632–645 (2004).
- 37.↵
Bulkin, D. A. & Groh, J. M. Systematic mapping of the monkey inferior colliculus reveals enhanced low frequency sound representation. J Neurophysiol 105, 1785–1797 (2011).
- 38.↵
London, J. Hearing in Time. (Oxford University Press, 2012). doi: 10.1093/acprof:oso/9780199744374.001.0001
- 39.↵
van Noorden, L. & Moelants, D. Resonance in the Perception of Musical Pulse. Journal of New Music Research 28, 43–66 (1999).
- 40.↵
Large, E. W. in Psychology of Time (2008).
- 41.↵
Large, E. W. & Jones, M. R. The dynamics of attending: How people track time-varying events. Psychol Rev 106, 119–159 (1999).
- 42.↵
Nozaradan, S., Peretz, I., Missal, M. & Mouraux, A. Tagging the Neuronal Entrainment to Beat and Meter. J Neurosci 31, 10234–10240 (2011).
- 43.↵
Chapin, H. L. et al. Neural Responses to Complex Auditory Rhythms: The Role of Attending. Front Psychol 1, (2010).
- 44.↵
Parent, A. & Hazrati, L.-N. Functional anatomy of the basal ganglia. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Research Reviews 20, 91–127 (1995).
- 45.↵
Reale, R. A. & Imig, T. J. Auditory cortical field projections to the basal ganglia of the cat. Neuroscience 8, 67–86 (1983).
- 46.↵
Grahn, J. A. & Rowe, J. B. Finding and feeling the musical beat: striatal dissociations between detection and prediction of regularity. Cereb. Cortex 23, 913–921 (2013).
- 47.↵
Lewis, P. A., Wing, A. M., Pope, P. A., Praamstra, P. & Miall, R. C. Brain activity correlates differentially with increasing temporal complexity of rhythms during initialisation, synchronisation, and continuation phases of paced finger tapping. Neuropsychologia 42, 1301–1312 (2004).
- 48.↵
Kotz, S. A., Schwartze, M. & Schmidt-Kassow, M. Non-motor basal ganglia functions: A review and proposal for a model of sensory predictability in auditory language perception. Cortex 45, 982–990 (2009).
- 49.↵
Stupacher, J., Wood, G. & Witte, M. Neural Entrainment to Polyrhythms: A Comparison of Musicians and Non-musicians. Front Neurosci 11, 1664 (2017).
- 50.
Lehmann, A., Arias, D. J. & Schönwiesner, M. Tracing the neural basis of auditory entrainment. Neuroscience 337, 306–314 (2016).
- 51.↵
Zoefel, B., ten Oever, S. & Sack, A. T. The Involvement of Endogenous Neural Oscillations in the Processing of Rhythmic Input: More Than a Regular Repetition of Evoked Neural Responses. Front Neurosci 12, 95 (2018).
- 52.↵
Obleser, J., Henry, M. J. & Lakatos, P. What do we talk about when we talk about rhythm? PLoS Biol 15, e2002794 (2017).
- 53.
Breska, A. & Deouell, L. Y. Dance to the rhythm, cautiously: Isolating unique indicators of oscillatory entrainment. PLoS Biol 15, e2003534 (2017).
- 54.↵
Zhou, H., Melloni, L., Poeppel, D. & Ding, N. Interpretations of Frequency Domain Analyses of Neural Entrainment: Periodicity, Fundamental Frequency, and Harmonics. Front Hum Neurosci 10, 509 (2016).
- 55.↵
Henry, M. J., Herrmann, B. & Grahn, J. A. What can we learn about beat perception by comparing brain signals and stimulus envelopes? PLoS ONE 12, e0172454 (2017).
- 56.↵
Celma-Miralles, A. & Toro, J. M. Beat Perception in a Non-Vocal Learner: Rats Can Identify Isochronous Beats. Proceedings of the International Conference on the Evolution of Language 1–2 (2018).
- 57.↵
Li, J. et al. Primary Auditory Cortex is Required for Anticipatory Motor Response. Cereb. Cortex 1–18 (2017). doi: 10.1093/cercor/bhx079
- 58.↵
Aschersleben, G. Temporal Control of Movements in Sensorimotor Synchronization. Brain and Cognition 48, 66–79 (2002).
- 59.↵
Winkler, I., Háden, G. P., Ladinig, O., Sziller, I. & Honing, H. Newborn infants detect the beat in music. Proc. Natl. Acad. Sci. U.S.A. 106, 2468–2471 (2009).
- 60.↵
Repp, B. H. Perception of timing is more context sensitive than sensorimotor synchronization. Perception & Psychophysics 64, 703–716 (2002).
- 61.↵
Repp, B. H. Sensorimotor synchronization: a review of the tapping literature. Psychon Bull Rev 12, 969–992 (2005).
- 62.↵
Takeya, R., Kameda, M., Patel, A. D. & Tanaka, M. Predictive and tempo-flexible synchronization to a visual metronome in monkeys. Scientific Reports 7, 6127 (2017).
- 63.↵
Gámez, J. et al. Predictive rhythmic tapping to isochronous and tempo changing metronomes in the nonhuman primate. Ann. N. Y. Acad. Sci. 1423, 396–414 (2018).
- 64.↵
Grube, M., Cooper, F. E., Chinnery, P. F. & Griffiths, T. D. Dissociation of duration-based and beat-based auditory timing in cerebellar degeneration. Proc Natl Acad Sci USA 107, 11597–11601 (2010).
- 65.
Merchant, H., Zarco, W., Pérez, O., Prado, L. & Bartolo, R. Measuring time with different neural chronometers during a synchronization-continuation task. Proc Natl Acad Sci USA 108, 19784–19789 (2011).
- 66.↵
Breska, A. & Deouell, L. Y. Neural mechanisms of rhythm-based temporal prediction: Delta phase-locking reflects temporal predictability but not rhythmic entrainment. PLoS Biol 15, e2001665 (2017).
- 67.↵
Fitch, W. T. Rhythmic cognition in humans and animals: distinguishing meter and pulse perception. Front Syst Neurosci 1–16 (2013). doi: 10.3389/fnsys.2013.00068
- 68.↵
Hauser, M. D. & McDermott, J. The evolution of the music faculty: a comparative perspective. Nat Neurosci 6, 663–668 (2003).
- 69.
Fitch, W. T. The biology and evolution of music: A comparative perspective. Cognition 100, 173–215 (2006).
- 70.↵
Jackendoff, R. Parallels and Nonparallels between Language and Music. Music Perception: An Interdisciplinary Journal 26, 195–204 (2009).
- 71.↵
Tajadura-Jiménez, A. et al. As Light as your Footsteps. in 2943–2952 (ACM Press, 2015). doi: 10.1145/2702123.2702374
- 72.↵
Visell, Y. et al. Sound design and perception in walking interactions. International Journal of Human-Computer Studies 67, 947–959 (2009).
- 73.↵
Rajendran, V. G., Teki, S. & Schnupp, J. W. H. Temporal Processing in Audition: Insights from Music. Neuroscience (2017). doi: 10.1016/j.neuroscience.2017.10.041
- 74.↵
Buhusi, C. V. & Meck, W. H. What makes us tick? Functional and neural mechanisms of interval timing. Nat Rev Neurosci 6, 755–765 (2005).
- 75.↵
Singer, Y. et al. Sensory cortex is optimized for prediction of future input. eLife 1–31 (2018). doi: 10.7554/eLife.31557.001
- 76.↵
McKinney, M. F., Moelants, D., Davies, M. E. P. & Klapuri, A. Evaluation of Audio Beat Tracking and Music Tempo Extraction Algorithms. Journal of New Music Research 36, 1–16 (2007).
- 77.↵
Peters, M. The relationship between variability of intertap intervals and interval duration. Psychological Research 51, 38–42 (1989).
- 78.
Thaut, M. H., Miller, R. A. & Schauer, L. M. Multiple synchronization strategies in rhythmic sensorimotor tasks: phase vs period correction. Biol Cybern 79, 241–250 (1998).
- 79.↵
Ivry, R. B. & Hazeltine, R. E. Perception and production of temporal intervals across a range of durations: evidence for a common timing mechanism. J Exp Psychol Hum Percept Perform 21, 3–18 (1995).
- 80.↵
Kadir, S. N., Goodman, D. F. & Harris, K. D. High-dimensional cluster analysis with the Masked EM Algorithm. arXiv preprint arXiv:1309.2848 (2013).
- 81.↵
Sahani, M. & Linden, J. F. How linear are auditory cortical responses? Advances in Neural Information Processing Systems 15, 125 (2003).
- 82.↵
- 83.
Willmore, B. D. B., Schoppe, O., King, A. J., Schnupp, J. W. H. & Harper, N. S. Incorporating Midbrain Adaptation to Mean Sound Level Improves Models of Auditory Cortical Processing. J Neurosci 36, 280–289 (2016).
- 84.↵
Harper, N. S. et al. Network Receptive Field Modeling Reveals Extensive Integration and Multi-feature Selectivity in Auditory Cortical Neurons. PLoS Comput Biol 12, e1005113 (2016).
- 85.↵
Rabinowitz, N. C., Willmore, B. D. B., Schnupp, J. W. H. & King, A. J. Spectrotemporal Contrast Kernels for Neurons in Primary Auditory Cortex. J Neurosci 32, 11271–11284 (2012).