Brazilian patent BR112012026984B1: Apparatus and method for modifying an input audio signal

Abstract:
APPARATUS AND METHOD FOR MODIFYING AN INPUT AUDIO SIGNAL. An apparatus for modifying an input audio signal comprises an excitation determiner, a storage device and a signal modifier. The excitation determiner determines a value of an excitation parameter of a subband of a plurality of subbands of the input audio signal based on an energy content of the subband. In addition, the storage device stores a look-up table that contains a plurality of spectral weighting factors. A spectral weighting factor of the plurality of spectral weighting factors is associated with a predefined value of the excitation parameter and a subband of the plurality of subbands. The storage device provides a spectral weighting factor corresponding to the determined value of the excitation parameter and corresponding to the subband for which the excitation parameter value is determined.
Publication number: BR112012026984B1
Application number: R112012026984-4
Filing date: 2011-04-20
Publication date: 2021-06-08
Inventors: Christian Uhle; Jürgen Herre; Oliver Hellmuth; Stefan Finauer
Applicant: Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V.
Main IPC class:

Patent description:

DESCRIPTION
Embodiments according to the invention relate to audio signal processing and particularly to an apparatus and method for modifying an input audio signal.
There have been many attempts to develop a satisfactory objective method of measuring loudness. Fletcher and Munson determined in 1933 that human hearing is less sensitive at low and high frequencies than at middle (or voice) frequencies. They also found that the relative change in sensitivity decreased as the level of the sound increased. An early loudness meter consisted of a microphone, amplifier, meter and a combination of filters designed to roughly mimic the frequency response of hearing at low, medium and high sound levels.
Although such devices provided a measurement of the loudness of a single, constant-level, isolated tone, measurements of more complex sounds did not match the subjective impressions of loudness very well. Sound level meters of this type have been standardized but are only used for specific tasks, such as the monitoring and control of industrial noise.
In the early 1950s, Zwicker and Stevens, among others, extended the work of Fletcher and Munson in developing a more realistic model of the loudness perception process. Stevens published a method for "Calculation of the Loudness of Complex Noise" in the Journal of the Acoustical Society of America in 1956, and Zwicker published his article "Psychological and Methodical Basis of Loudness" in Acoustica in 1958. In 1959, Zwicker published a graphical procedure for loudness calculation, as well as several similar articles shortly thereafter. The Stevens and Zwicker methods have been standardized as ISO 532, parts A and B (respectively). Both methods involve similar steps.
First, the time-varying distribution of energy along the basilar membrane of the inner ear, referred to as the excitation, is simulated by passing the audio through a bank of band-pass auditory filters with center frequencies spaced uniformly on a critical band rate scale. Each auditory filter is designed to simulate the frequency response at a particular location along the basilar membrane of the inner ear, with the center frequency of the filter corresponding to that location. A critical bandwidth is defined as the bandwidth of one such filter. Measured in units of hertz, the critical bandwidth of these auditory filters increases with increasing center frequency. It is therefore useful to define a warped frequency scale such that the critical bandwidth of all auditory filters measured on this warped scale is constant. Such a warped scale is referred to as the critical band rate scale and is very useful in understanding and simulating a wide range of psychoacoustic phenomena. See, for example, Psychoacoustics: Facts and Models by E. Zwicker and H. Fastl, Springer-Verlag, Berlin, 1990. The methods of Stevens and Zwicker use a critical band rate scale referred to as the Bark scale, in which the critical bandwidth is constant below 500 Hz and increases above 500 Hz. More recently, Moore and Glasberg defined a critical band rate scale, which they named the Equivalent Rectangular Bandwidth (ERB) scale (B. C. J. Moore, B. Glasberg, T. Baer, "A Model for the Prediction of Thresholds, Loudness, and Partial Loudness", Journal of the Audio Engineering Society, Vol. 45, No. 4, April 1997, pp. 224-240). Through psychoacoustic experiments using notched-noise maskers, Moore and Glasberg demonstrated that the critical bandwidth continues to decrease below 500 Hz, in contrast to the Bark scale, in which the critical bandwidth remains constant.
The term "critical band" goes back to the work of Harvey Fletcher in 1938 on the masking of sound sensation by accompanying signals ("J. B. Allen, "A short history of telephone psychophysics", Audio Eng. Soc. Convention, 1997"). Critical bands can be expressed using the Bark scale proposed by Zwicker in 1961: each critical band has a width of one Bark (a unit named after Heinrich Barkhausen). Among the filterbanks that mimic human auditory perception there is also, for example, the Equivalent Rectangular Bandwidth (ERB) scale ("B. C. J. Moore, B. R. Glasberg and T. Baer, "A model for the prediction of thresholds, loudness, and partial loudness", J. Audio Eng. Soc., 1997").
The term "specific loudness" describes the sensation of loudness caused by a signal in a certain region of the basilar membrane, in a certain frequency bandwidth measured in critical bands. It is measured in units of sone/Bark. The term "critical band" refers to the frequency bands of an auditory filterbank comprising non-uniform band-pass filters designed to mimic the frequency resolution of human hearing. The overall loudness of a sound equals the sum (or integral) of the specific loudness across all critical bands.
A method for processing an audio signal was described in "A. J. Seefeldt, "Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal", US Patent 2009/0097676, 2009". This method aims at controlling the specific loudness of the audio signal, with applications to volume control, dynamic range control, dynamic equalization and background noise compensation. In this method, an input audio signal (usually in the frequency domain) is modified so that its specific loudness matches a target specific loudness.
To illustrate the advantage of the processing presented in "A. J. Seefeldt, "Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal", US Patent 2009/0097676, 2009", consider volume control. Changing the level of an audio signal in sound reproduction is usually aimed at altering its perceived loudness. Stated differently, loudness control is traditionally implemented as sound level control. However, psychoacoustic knowledge indicates that this is not ideal.
The sensitivity of human hearing varies with both frequency and level, so that a reduction in sound intensity level attenuates the sensation of low and high frequencies (for example, around 100 Hz and 10,000 Hz, respectively) more than the sensation of middle frequencies (for example, between 2000 and 4000 Hz). When the reproduction level is lowered from a "comfortably loud" level (for example, 75-80 dBA) to a lower level, for example by 18 dB, the perceived spectral balance of the audio signal changes. This is illustrated by the well-known equal loudness contours, often referred to as Fletcher-Munson curves (after the researchers who first measured equal loudness contours in 1933). An equal loudness contour shows the sound pressure level (SPL) over the frequency spectrum for which a listener perceives constant loudness when presented with steady pure tones.
The equal loudness contours are depicted, for example, in "B. C. J. Moore, B. R. Glasberg and T. Baer, "A model for the prediction of thresholds, loudness, and partial loudness", J. Audio Eng. Soc., 1997", p. 232, Figure 13. A revised measurement was standardized as ISO 226:2003 in 2003.
Consequently, conventional loudness control changes not only the loudness but also the timbre. The impact of this effect depends on the SPL (it is less pronounced when changing the SPL, for example, from 86 dBA to 68 dBA than when changing it from 76 dBA to 58 dBA), but it is undesirable in all cases.
This is compensated for by the processing described in "A. J. Seefeldt, "Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal", US Patent 2009/0097676, 2009".
Figure 7 presents a flowchart of a method 700 described in "A.J. Seefeldt, "Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal". US Patent 2009/0097676, 2009".
The output signal is obtained by calculating 710 the excitation signal, calculating 720 the specific loudness, calculating 730 the target specific loudness, calculating 740 the target excitation signal, calculating 750 the spectral weights, and applying 760 the spectral weights to the input signal and synthesizing the output signal.
The spectral weights H are frequency-band weights that depend on the specific loudness of the input signal and on the target specific loudness. Their calculation, as described in "A. J. Seefeldt, "Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal", US Patent 2009/0097676, 2009", comprises the calculation of the specific loudness and the inverse process of calculating the specific loudness, which is applied to the target specific loudness.
Both processing steps introduce a high computational load. Methods for calculating the specific loudness were presented in "E. Zwicker, H. Fastl, U. Widmann, K. Kurakata, S. Kuwano and S. Namba, "Program for calculating loudness according to DIN 45631 (ISO 532 B)", J. Acoust. Soc. Jpn. (E), vol. 12, 1991" and "B. C. J. Moore, B. R. Glasberg and T. Baer, "A model for the prediction of thresholds, loudness, and partial loudness", J. Audio Eng. Soc., 1997".
It is the aim of the present invention to provide an improved concept for modifying audio signals to allow an efficient implementation with low computational complexity.
This object is solved by an apparatus according to claim 1, or a method according to claim 20. An embodiment of the invention provides an apparatus for modifying an input audio signal comprising an excitation determiner, a storage device and a signal modifier. The excitation determiner is configured to determine a value of an excitation parameter of a subband of a plurality of subbands of the input audio signal based on an energy content of the subband signal. The storage device is configured to store a look-up table containing a plurality of spectral weighting factors, wherein a spectral weighting factor of the plurality of spectral weighting factors is associated with a predefined value of the excitation parameter and a subband of the plurality of subbands. Furthermore, the storage device is configured to provide a spectral weighting factor corresponding to the determined value of the excitation parameter and corresponding to the subband for which the value of the excitation parameter is determined. The signal modifier is configured to modify a content of the subband of the input audio signal for which the excitation parameter is determined, based on the provided spectral weighting factor, to provide a modified subband.
The embodiments according to the present invention are based on the central idea that the subbands of an audio signal can be easily modified using a look-up table containing spectral weighting factors, which can be chosen depending on the respective subband and on the excitation parameter of the subband. For that purpose, the look-up table contains spectral weighting factors for a plurality of predefined excitation parameter values for at least one predefined subband of the plurality of subbands. By using the look-up table, the computational complexity can be significantly reduced, since an explicit calculation of the spectral weighting factors (which includes the calculation of the loudness, its modification and the inverse of the loudness calculation) is not necessary. Therefore, an efficient implementation is possible.
In some embodiments in accordance with the invention, the excitation determiner determines a value of an excitation parameter for only some of the subbands of the plurality of subbands. Furthermore, the look-up table contains only spectral weighting factors associated with the subbands for which a value of an excitation parameter is determined. In this way, the storage space required for the look-up table and the computational effort of the excitation determiner can be reduced.
Some embodiments according to the invention relate to a look-up table comprising exactly three dimensions associated with predefined values of the excitation parameter, subbands of the plurality of subbands and predefined values of an external modification parameter.
Some further embodiments according to the invention relate to a look-up table comprising exactly four dimensions associated with predefined values of the excitation parameter, subbands of the plurality of subbands, predefined values of the external modification parameter and predefined values of a background noise parameter.
Embodiments in accordance with the invention will be detailed subsequently with reference to the accompanying drawings, in which:
Figure 1 is a block diagram of an apparatus for modifying an input audio signal;
Figure 2 is a schematic illustration of equal loudness contours;
Figure 3 is a schematic illustration of equal loudness contours normalized by transmission filters;
Figure 4 is a block diagram of an apparatus for modifying an input audio signal;
Figure 5 is a flowchart of a method for modifying an input audio signal;
Figure 6 is a flowchart of a method for modifying an input audio signal; and
Figure 7 is a flowchart of a known method for modifying an input audio signal.
In the following, the same reference numerals are partially used for objects and functional units having the same or similar functional properties, and their description in relation to one figure should also apply to other figures, in order to reduce redundancy in the description of embodiments.
Figure 1 presents a block diagram of an apparatus 100 for modifying an input audio signal, in accordance with an embodiment of the invention. Apparatus 100 comprises an excitation determiner 110, a storage device 120 and a signal modifier 130. The excitation determiner 110 is connected to the storage device 120, and the storage device 120 is connected to the signal modifier 130. The excitation determiner 110 determines a value 112 of an excitation parameter of a subband 102 of a plurality of subbands of the input audio signal, based on an energy content of the subband 102. The storage device 120 stores a look-up table containing a plurality of spectral weighting factors, wherein a spectral weighting factor 124 of the plurality of spectral weighting factors is associated with a predefined value of the excitation parameter and a subband of the plurality of subbands. Furthermore, the storage device 120 provides a spectral weighting factor 124 corresponding to the determined value 112 of the excitation parameter and corresponding to the subband 102 for which the value 112 of the excitation parameter is determined. The signal modifier 130 modifies a content of the subband 102 of the input audio signal for which the value 112 of the excitation parameter is determined, based on the provided spectral weighting factor 124, to obtain and provide a modified subband 132.
By using a look-up table to provide spectral weighting factors 124 to modify the input audio signal, computational complexity can be significantly reduced compared to known concepts.
The excitation determiner 110 determines a value 112 of an excitation parameter based on an energy content of the subband 102. This can be done, for example, by measuring the energy content of a subband 102 to determine the value 112 of the excitation parameter for that subband 102. Thus, an excitation parameter can represent a measure of the energy per subband or of the short-term energy in a specific subband, since the energy content can vary over time and/or between different subbands. Alternatively, the value of the excitation parameter can be determined based on a function (unique, injective, bijective) of the short-term energy of a subband (for example, an exponential function, a logarithmic function or a linear function). For example, a quantization function can be used. In this example, the excitation determiner 110 can measure the energy content of a subband and can quantize the measured energy content to obtain the value of the excitation parameter, such that the value of the excitation parameter is equal to a predefined value of the excitation parameter. In other words, a measured energy value can be assigned to a predefined value of the excitation parameter (for example, the closest predefined value). Alternatively, the value of the excitation parameter directly indicates the measured energy content, and the storage device 120 can assign the determined value of the excitation parameter to a predefined value of the excitation parameter.
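The quantization variant described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the 5 dB grid `PREDEFINED_DB`, the dB representation of the excitation parameter and both function names are assumptions made for the example.

```python
import math

# Hypothetical grid of predefined excitation values in dB (assumed: 0..100 dB, 5 dB steps).
PREDEFINED_DB = [5.0 * i for i in range(21)]

def short_term_energy_db(samples):
    """Mean-square energy of one block of subband samples, in dB.

    The floor of 1e-12 only avoids log10(0) for silent blocks.
    """
    energy = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(energy, 1e-12))

def quantize_excitation(value_db, grid=PREDEFINED_DB):
    """Return the predefined excitation value closest to the measured value."""
    return min(grid, key=lambda g: abs(g - value_db))

block = [100.0, -100.0, 100.0, -100.0]  # mean square is 1e4, i.e. 40 dB
measured = short_term_energy_db(block)
print(quantize_excitation(measured))     # 40.0
```

With a quantized value, the look-up table can be addressed directly by a grid index, so no interpolation is needed at run time.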
Subbands of the input audio signal can represent different frequency ranges of the input audio signal. To take into account a perceptual distribution of frequency bands, subbands can be distributed, for example, according to the ERB scale or the Bark scale or another frequency spacing that mimics the frequency resolution of the human ear. In other words, the subbands of the plurality of subbands of the input audio signal can be divided according to the ERB scale or the Bark scale.
The storage device 120 comprises an input for the excitation parameter (excitation signal) and an input for a subband index, which indicates the subband 102 for which the value 112 of the excitation parameter is determined. Additionally, the storage device may comprise one or more further inputs for additional parameters.
Storage device 120 can be a digital storage medium, for example, a read-only memory (ROM), a hard disk, a CD, a DVD or any other type of non-volatile memory, or a random access memory (RAM).
The look-up table represents an array of at least two dimensions containing the plurality of spectral weighting factors. A spectral weighting factor 124 contained in the look-up table is unambiguously associated with a predefined value of the excitation parameter and a subband of the plurality of subbands. In other words, each spectral weighting factor contained in the look-up table can be associated with a predefined excitation parameter value and a subband of the plurality of subbands. Storage device 120 may provide the spectral weighting factor 124 associated with the predefined value of the excitation parameter closest to the determined value 112 of the excitation parameter. Alternatively, for example, storage device 120 may linearly or logarithmically interpolate between the two spectral weighting factors associated with the two predefined values of the excitation parameter closest to the determined value 112 of the excitation parameter.
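The two retrieval strategies just described (nearest predefined value, or linear interpolation between the two nearest) can be sketched as below. The grid, the table contents and the function names are illustrative assumptions; real weighting factors would be computed offline from a loudness model.

```python
from bisect import bisect_right

# Hypothetical grid of predefined excitation values (dB) and a 2-D table:
# one row per subband, one column per predefined excitation value.
EXCITATION_GRID_DB = [0.0, 20.0, 40.0, 60.0, 80.0]
WEIGHT_TABLE = [
    [1.8, 1.5, 1.2, 1.0, 1.0],  # subband 0 (illustrative values only)
    [1.2, 1.1, 1.0, 1.0, 1.0],  # subband 1
]

def lookup_nearest(subband, excitation_db):
    """Weight stored for the predefined excitation value closest to the input."""
    idx = min(range(len(EXCITATION_GRID_DB)),
              key=lambda i: abs(EXCITATION_GRID_DB[i] - excitation_db))
    return WEIGHT_TABLE[subband][idx]

def lookup_linear(subband, excitation_db):
    """Linear interpolation between the two neighbouring table entries."""
    grid = EXCITATION_GRID_DB
    x = min(max(excitation_db, grid[0]), grid[-1])   # clamp to the table range
    hi = min(bisect_right(grid, x), len(grid) - 1)   # upper neighbour index
    lo = hi - 1                                      # lower neighbour index
    frac = (x - grid[lo]) / (grid[hi] - grid[lo])
    row = WEIGHT_TABLE[subband]
    return row[lo] + frac * (row[hi] - row[lo])

print(lookup_nearest(0, 33.0))   # entry stored for 40 dB
print(lookup_linear(0, 30.0))    # halfway between the 20 dB and 40 dB entries
```

Nearest-value retrieval costs one index computation per subband and block; interpolation adds a multiply-add but allows a coarser, smaller table.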
The predefined values of the excitation parameter can be linearly or logarithmically distributed.
The signal modifier 130 can, for example, amplify or attenuate the content of the subband 102, for which the value 112 of the excitation parameter is determined, by the spectral weighting factor provided 124.
By using the described concept, for example, the varying attenuation of the human hearing sensation at low, medium and high frequencies caused by an increase or decrease in the sound intensity level of an audio signal can be easily compensated for. For example, when the reproduction level is lowered from one level to another, the perceived spectral balance of the audio signal changes. This is illustrated in Figure 2 and Figure 3, which show equal loudness contours. Especially in the low-frequency region, the different equal loudness contours are not parallel to each other. Amplifying or attenuating the low-frequency bands differently from the medium- and/or high-frequency bands can correspond to bending the equal loudness contours so that they become parallel, or more parallel than before. In this way, the change in the perceived spectral balance can be compensated, or almost compensated, by using the described concept.
The difference between the equal loudness contours of Figure 2 and those of Figure 3 is a normalization by a transmission filter. This transmission filter can simulate the filtering effect of the transmission of audio through the outer and middle ear. This transmission filter can optionally be implemented in the apparatus shown in Figure 1 to filter the input audio signal before providing it to the excitation determiner 110.
For a more continuous modification of the input audio signal, the excitation determiner 110 can determine a value 112 of an excitation parameter for more than one subband of the plurality of subbands. Then, storage device 120 can provide a spectral weighting factor 124 for each subband 102 for which a value 112 of an excitation parameter is determined, and signal modifier 130 can modify the content of each subband 102 for which a value 112 of an excitation parameter is determined, based on the respective provided spectral weighting factor 124.
The plurality of subbands of the input audio signal can be provided by a memory unit or can be generated by an analysis filterbank.
An excitation parameter can be determined for one subband, for more than one subband or for all subbands of the plurality of subbands. For that purpose, the apparatus 100 may comprise only one excitation determiner 110, which determines one, more than one or all of the excitation parameter values, or may comprise an excitation determiner 110 for each subband 102 for which a value 112 of an excitation parameter is determined. In addition, apparatus 100 may comprise one or more signal modifiers 130 for the one or more subbands for which an excitation parameter is determined. However, it is sufficient to use a single look-up table (and storage device) for all subbands 102 for which a value 112 of an excitation parameter is determined.
The excitation determiner 110, storage device 120 and signal modifier 130 can be independent hardware units, part of a computer, microcontroller or digital signal processor, or a computer program or software product configured to run on a computer, microcontroller or digital signal processor.
Figure 4 presents a block diagram of an apparatus 400 for modifying an input audio signal in accordance with an embodiment of the invention. Apparatus 400 is similar to the apparatus shown in Figure 1, but additionally comprises an analysis filterbank 410 and a synthesis filterbank 420. The analysis filterbank 410 separates the input audio signal into the plurality of subbands. Then, the excitation determiner 110 determines a value of the excitation parameter (that is, computes a feature) for one or more subbands of the plurality of subbands. Thereafter, storage device 120 provides the one or more corresponding spectral weighting factors to one or more signal modifiers 130. Finally, synthesis filterbank 420 combines the plurality of subbands, which contains at least one modified subband, to obtain and provide a modified audio signal (or output audio signal).
The example shown in Figure 4 can be an application of the proposed method to a generic case. The processing shown for the n-th subband signal (n-th subband) can be applied in the same way to all other subband signals (or only to those subbands for which an excitation parameter value is determined).
Optionally, a spectral weighting factor contained by the look-up table is further associated with a predefined value of an external modification parameter, as indicated by the dashed line in Figure 4 (but also applicable to the apparatus shown in Figure 1). The external modification parameter (or simply the modification parameter) can represent, for example, an input value from a user interface (for example, volume and/or ambient adjustments). Consequently, in that case, the storage device 120 can provide a spectral weighting factor corresponding to the value of the external modification parameter. For example, if a user increases or decreases the volume setting, the value of the external modification parameter changes and the storage device 120 can provide another corresponding spectral weighting factor. In summary, the storage device 120 can provide a spectral weighting factor corresponding to the determined value of the excitation parameter of a subband, corresponding to the subband, for which the excitation parameter value is determined, and corresponding to a value of the parameter of external modification.
In this example, the look-up table can comprise exactly three dimensions associated with the predefined values of the excitation parameter, with the subbands of the plurality of subbands and with the predefined values of the external modification parameter. This means that each spectral weighting factor contained in the look-up table is associated with a specific predefined value of the excitation parameter, a subband of the plurality of subbands and a specific predefined value of the external modification parameter. In other words, the look-up table contains one spectral weighting factor for each combination of a predefined excitation parameter value, a subband and a predefined external modification parameter value. The predefined values of the external modification parameter can be distributed, for example, linearly or logarithmically over the possible range of the external modification parameter.
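A three-dimensional table of this kind can be sketched as a nested list addressed by the three indices. Everything here is a hypothetical illustration: the volume grid stands in for the external modification parameter, the table is filled with placeholder values, and nearest-value addressing is only one of the options mentioned in the text.

```python
# Hypothetical predefined values for the external modification parameter
# (here a volume setting in dB) and for the excitation parameter (dB).
VOLUME_GRID_DB = [-20.0, -10.0, 0.0]
EXCITATION_GRID_DB = [20.0, 40.0, 60.0, 80.0]
NUM_SUBBANDS = 2

def nearest_index(grid, value):
    """Index of the grid entry closest to value."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))

# table[v][b][e]: one weight per (volume value, subband, excitation value).
# Placeholder weights of 1.0; a real table would be computed offline.
table = [[[1.0 for _ in EXCITATION_GRID_DB] for _ in range(NUM_SUBBANDS)]
         for _ in VOLUME_GRID_DB]
table[0][0][1] = 1.6  # illustrative: boost subband 0 at low volume

def lookup_weight(volume_db, subband, excitation_db):
    """Fetch the weight for the nearest predefined volume and excitation values."""
    v = nearest_index(VOLUME_GRID_DB, volume_db)
    e = nearest_index(EXCITATION_GRID_DB, excitation_db)
    return table[v][subband][e]

def modify_subband(samples, weight):
    """Signal modifier: scale the subband content by the provided weight."""
    return [weight * s for s in samples]

w = lookup_weight(volume_db=-18.0, subband=0, excitation_db=43.0)
print(modify_subband([1.0, -2.0], w))  # [1.6, -3.2]
```

The table size is the product of the three grid sizes, which is the memory cost traded against the avoided run-time loudness computation.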
In addition, in some embodiments, a spectral weighting factor contained in the look-up table is also associated with a predefined value of a background noise parameter. The background noise parameter can represent the background noise level of the input audio signal. In this way, for example, the partial masking of an audio signal in the presence of background noise can be compensated for. In that case, the storage device can provide a spectral weighting factor corresponding to a value of the background noise parameter. This can be done in addition to, or as an alternative to, the above-mentioned consideration of the external modification parameter. If both are considered, the storage device can provide the spectral weighting factor corresponding to the determined value of the excitation parameter of the subband, corresponding to the subband for which the excitation parameter is determined, corresponding to a value of the external modification parameter and corresponding to a value of the background noise parameter. In that case, the look-up table can comprise exactly four dimensions associated with the predefined values of the excitation parameter, with the subbands of the plurality of subbands, with the predefined values of the external modification parameter and with the predefined values of the background noise parameter. The predefined values of the background noise parameter can be distributed, for example, linearly or logarithmically over the possible range of the background noise parameter.
A value of the background noise parameter can be determined by a background noise detector. This can be done for the entire input audio signal before it is split into subbands, or at subband level for one subband, for more than one subband or for all subbands individually. Alternatively, if the plurality of subbands of the input audio signal is stored in and provided by a memory unit, the value of the background noise parameter can also be provided by the memory unit.
In any case, the storage device does not comprise an input for a specific loudness parameter or a target specific loudness parameter, although the spectral weighting factors contained in the look-up table can be calculated based on a specific loudness parameter or a target specific loudness parameter. The calculation of the spectral weighting factors can be done externally, and the factors can then be stored in the storage device. Therefore, the computational complexity of a device realized according to the described concept can be significantly reduced compared to known devices, since an explicit calculation of the spectral weighting factors at run time is not necessary.
Spectral weighting factors can be calculated to be stored by the storage device, for example, as follows.
Audio processing can be performed in the digital domain. Accordingly, the input audio signal can be denoted by the discrete-time sequence x[n], which has been sampled from the audio source at some sampling frequency fs. It can be assumed that the sequence x[n] has been appropriately scaled, so that the rms energy of x[n] in decibels, given by

$$\mathrm{RMS}_{dB} = 10 \log_{10}\left(\frac{1}{L}\sum_{n=0}^{L-1} x^{2}[n]\right),$$

where L denotes the number of samples in the sequence, is equal to the sound pressure level in dB at which the audio is being heard by a human listener. Furthermore, the audio signal can be assumed to be monophonic for simplicity of exposition.
The input audio signal is applied to an analysis filterbank or filterbank function ("Analysis Filterbank"). Each filter in the Analysis Filterbank is designed to simulate the frequency response at a particular location along the basilar membrane in the inner ear. The filterbank may include a set of linear filters whose bandwidth and spacing are constant on the Equivalent Rectangular Bandwidth (ERB) frequency scale as defined by Moore, Glasberg and Baer ("B. C. J. Moore, B. Glasberg, T. Baer, "A Model for the Prediction of Thresholds, Loudness, and Partial Loudness", supra").
Although the ERB frequency scale more closely matches human perception and shows improved performance in producing objective loudness measurements that match subjective loudness results, the Bark frequency scale can be employed with reduced performance.
For a center frequency f in hertz, the width of one ERB band in hertz can be approximated as:

$$\mathrm{ERB}(f) = 24.7\,(4.37 f / 1000 + 1) \tag{1}$$

From this relationship, a warped frequency scale is defined so that, at any point along the warped scale, the corresponding ERB in units of the warped scale is equal to one. The function for converting from linear frequency in hertz to this ERB frequency scale is obtained by integrating the reciprocal of Equation 1:

$$\mathrm{HzToERB}(f) = \int \frac{1}{24.7\,(4.37 f / 1000 + 1)}\,df = 21.4 \log_{10}(4.37 f / 1000 + 1) \tag{2a}$$

It is also useful to express the transformation from the ERB scale back to the linear frequency scale by solving Equation 2a for f:

$$\mathrm{ERBToHz}(e) = f = \frac{1000}{4.37}\,10^{e/21.4} - \frac{1000}{4.37} \tag{2b}$$

where e is in units of the ERB scale.
The analysis filterbank may include B auditory filters, referred to as subbands, at center frequencies fc[1] ... fc[B] evenly spaced along the ERB scale. More specifically,

$$f_c[1] = f_{min} \tag{3a}$$
$$f_c[b] = \mathrm{ERBToHz}\big(\mathrm{HzToERB}(f_c[b-1]) + \Delta\big), \quad b = 2 \ldots B \tag{3b}$$
$$f_c[B] < f_{max} \tag{3c}$$

where Δ is the desired ERB spacing of the analysis filters, and where fmin and fmax are the desired minimum and maximum frequencies, respectively. One can choose Δ = 1 and, taking into account the frequency range over which the human ear is sensitive, set fmin = 50 Hz and fmax = 20,000 Hz. With these parameters, for example, applying Equations 3a to 3c yields B = 40 auditory filters.
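The filter count stated above can be checked numerically. The sketch below assumes the widely used Moore and Glasberg approximation for the ERB scale, 21.4 log10(4.37 f/1000 + 1), and its inverse; the function names are chosen for this example, and the spacing rule (step by Δ on the ERB scale, starting at fmin, while staying below fmax) is an interpretation of "evenly spaced along the ERB scale".

```python
import math

def hz_to_erb(f_hz):
    """Frequency in Hz to ERB-scale units (assumed Moore/Glasberg approximation)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_to_hz(e):
    """Inverse mapping: ERB-scale units back to frequency in Hz."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def center_frequencies(f_min=50.0, f_max=20000.0, delta=1.0):
    """Center frequencies spaced delta ERB apart, from f_min and strictly below f_max."""
    centers = []
    e = hz_to_erb(f_min)
    e_max = hz_to_erb(f_max)
    while e < e_max:
        centers.append(erb_to_hz(e))
        e += delta
    return centers

fc = center_frequencies()
print(len(fc))  # 40
```

With fmin = 50 Hz the first filter sits at about 1.84 ERB and fmax = 20,000 Hz corresponds to about 41.65 ERB, so 40 unit steps fit, which agrees with the B = 40 given in the text.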
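The magnitude frequency response of each auditory filter can be characterized by a rounded exponential function, as suggested by Moore and Glasberg. Specifically, the magnitude response of a filter with center frequency fc[b] can be computed as:

$$H_b(f) = \sqrt{(1 + p g)\, e^{-p g}} \tag{4a}$$

where

$$g = \left|\frac{f - f_c[b]}{f_c[b]}\right| \tag{4b}$$
$$p = \frac{4 f_c[b]}{\mathrm{ERB}(f_c[b])} \tag{4c}$$
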
The filtering operations of the Analysis Filterbank can be adequately approximated using a finite-length Discrete Fourier Transform, commonly referred to as the Short-Time Discrete Fourier Transform (STDFT), because it is believed that an implementation running the filters at the sampling rate of the audio signal, referred to as a full-rate implementation, provides more temporal resolution than is needed for accurate loudness measurements.
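The STDFT of the input audio signal x[n] can be defined as:

$$X[k,t] = \sum_{n=0}^{N-1} w[n]\, x[n + tT]\, e^{-j \frac{2\pi k}{N} n} \tag{5a}$$

where k is the frequency index, t is the time block index, N is the DFT size, T is the hop size, and w[n] is a window of length N normalized so that

$$\sum_{n=0}^{N-1} w^{2}[n] = 1 \tag{5b}$$
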
Note that the variable t in Equation 5a is a discrete index that represents the STDFT time block, as opposed to a measure of time in seconds. Each increment in t represents a hop of T samples along the signal x[n]. Subsequent references to the index t assume this definition. While different parameter settings and window shapes can be used depending on the implementation details, for fs = 44100 Hz, choosing N = 2048 and T = 1024 and setting w[n] to a Hann window provides a suitable balance of time and frequency resolution. The STDFT described above can be implemented more efficiently using the Fast Fourier Transform (FFT).
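The block-wise transform can be sketched as below. For readability the sketch uses a direct DFT sum rather than an FFT and a small block size; the periodic Hann window scaled to unit energy is one way to meet the normalization mentioned above and is an assumption, as are the function names.

```python
import cmath
import math

def hann_window(N):
    """Periodic Hann window scaled so that the sum of its squares is one."""
    raw = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]

def stdft(x, N, T):
    """Return spectra[t][k] for every full block of N samples, hopping by T samples."""
    w = hann_window(N)
    num_blocks = max(0, (len(x) - N) // T + 1)
    spectra = []
    for t in range(num_blocks):
        block = [w[n] * x[n + t * T] for n in range(N)]
        spectra.append([
            sum(block[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)
        ])
    return spectra

# A cosine that completes 8 cycles per 64-sample block should peak in bin 8.
N, T = 64, 32
signal = [math.cos(2.0 * math.pi * 8 * n / N) for n in range(128)]
X = stdft(signal, N, T)
```

In practice the inner sum would be replaced by an FFT routine; the direct sum is kept here only so that the sketch has no dependencies.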
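Instead of the STDFT, the Modified Discrete Cosine Transform (MDCT) can be used to implement the analysis filterbank. The MDCT is a transform commonly used in perceptual audio coders. The MDCT of the input audio signal x[n] can be given by:
$$X_{\mathrm{MDCT}}[k,t] = \sum_{n=0}^{N-1} w[n]\, x[n + tT] \cos\left(\frac{2\pi}{N}\left(k + \tfrac{1}{2}\right)(n + n_0)\right), \quad n_0 = \frac{N/2 + 1}{2} \tag{6}$$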
Generally speaking, the size of the hop T is chosen to be exactly half the length of the transform N, so that perfect reconstruction of the signal x[n] is possible.
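The perfect reconstruction property with a hop of half the transform length can be illustrated with a small sketch. It assumes the standard MDCT definition with phase offset n0 = (N/2 + 1)/2 and a sine window, which satisfies the Princen-Bradley condition; the window choice, the 4/N synthesis scaling and the function names are assumptions made for this example.

```python
import math

def sine_window(N):
    """Sine window; w[n]^2 + w[n + N/2]^2 = 1, as overlap-add reconstruction requires."""
    return [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]

def mdct(block, w):
    """Windowed MDCT of one block of N samples, giving N/2 coefficients."""
    N = len(block)
    n0 = (N / 2 + 1) / 2
    return [
        sum(w[n] * block[n] * math.cos(2.0 * math.pi / N * (n + n0) * (k + 0.5))
            for n in range(N))
        for k in range(N // 2)
    ]

def imdct(coeffs, w):
    """Windowed inverse MDCT giving N time-aliased samples for overlap-add."""
    N = 2 * len(coeffs)
    n0 = (N / 2 + 1) / 2
    return [
        4.0 / N * w[n] * sum(coeffs[k] * math.cos(2.0 * math.pi / N * (n + n0) * (k + 0.5))
                             for k in range(N // 2))
        for n in range(N)
    ]

def analyze_and_resynthesize(x, N):
    """MDCT analysis with hop T = N/2 followed by overlap-add synthesis."""
    T = N // 2
    w = sine_window(N)
    y = [0.0] * len(x)
    for start in range(0, len(x) - N + 1, T):
        out = imdct(mdct(x[start:start + N], w), w)
        for n in range(N):
            y[start + n] += out[n]
    return y

x = [math.sin(0.3 * n) + 0.01 * n for n in range(64)]
y = analyze_and_resynthesize(x, 16)
```

Samples covered by two overlapping blocks are reconstructed, because the time-domain aliasing of adjacent blocks cancels; the first and last half blocks are covered only once and are therefore not expected to match.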
The outputs of the Analysis Filterbank are applied to a transmission filter or transmission filter function ("Transmission Filter"), which filters each band of the filterbank according to the transmission of audio through the outer and middle ear.
In order to compute the noise of the input audio signal, a measure of the short-term energy of the audio signal in each filter of the Analysis Filterbank after application of the Transmission Filter is required. This time- and frequency-varying measure is referred to as the excitation. The short-term energy output of each filter in the Analysis Filterbank can be approximated in an Excitation Function E[b,t] by multiplying the filter responses in the frequency domain with the power spectrum of the input signal:
$$E[b,t] = \frac{1}{N} \sum_{k} \left|H_b[k]\right|^2 \left|P[k]\right|^2 \left|X[k,t]\right|^2 \tag{7}$$
where b is the subband number, t is the block number, and Hb[k] and P[k] are the frequency responses of the auditory filter and the transmission filter, respectively, sampled at the frequency corresponding to STDFT or MDCT bin index k. It should be noted that forms for the magnitude response of the auditory filters other than those specified in Equations 4a-c can be used in Equation 7 to achieve similar results.
In short, the output of the Excitation Function is a frequency-domain representation of the energy E in the respective ERB bands b per time block t.
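The excitation of one band in one block can be sketched as a weighted sum over spectral bins. The squared magnitudes and the 1/N normalization are assumptions consistent with the description above (filter responses multiplied with the power spectrum); the function name and argument layout are hypothetical.

```python
def excitation(X_block, H_b, P):
    """Excitation of one band for one block.

    X_block: complex spectrum of the block, one value per bin k
    H_b:     auditory filter response of band b, sampled per bin
    P:       transmission filter response, sampled per bin
    """
    N = len(X_block)
    total = 0.0
    for k in range(N):
        total += (abs(H_b[k]) ** 2) * (abs(P[k]) ** 2) * (abs(X_block[k]) ** 2)
    return total / N

# Toy example with four bins and a flat transmission filter.
E = excitation([1.0, 1j, 2.0, 0.0], [1.0, 0.5, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0])
```

Calling this function for every band b and every block t yields the time-frequency excitation pattern E[b,t].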
For certain applications, it may be desirable to smooth the excitation E[b,t] before transforming it to specific noise. For example, smoothing can be performed recursively in a Smoothing function, according to the equation:
$$\bar{E}[b,t] = \lambda_b\, \bar{E}[b,t-1] + (1 - \lambda_b)\, E[b,t]$$
where the time constants λb in each band b are selected according to the desired application. In most cases, the time constants can be advantageously chosen to be proportional to the integration time of human noise perception within band b. Watson and Gengel performed experiments demonstrating that this integration time is within the range of 150-175 ms at low frequencies (125-200 Hz) and 40-60 ms at high frequencies (Charles S. Watson and Roy W. Gengel, "Signal Duration and Signal Frequency in Relation to Auditory Sensitivity", Journal of the Acoustical Society of America, Vol. 46, No. 4 (Part 2), 1969, pp. 989-997).
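A recursive smoother of this kind can be sketched as a first-order recursion per band. The mapping from an integration time in seconds to the recursion coefficient, exp(-T / (tau·fs)), is a common convention and an assumption here, as are the function names and the zero initial state.

```python
import math

def smoothing_coefficient(tau_seconds, hop_size, sample_rate):
    """Recursion coefficient for a time constant tau, given one block every hop_size samples."""
    return math.exp(-hop_size / (tau_seconds * sample_rate))

def smooth_excitation(E_blocks, lam):
    """First-order recursive smoothing of one band's excitation over time blocks."""
    smoothed = []
    prev = 0.0                                  # assumed initial state
    for E in E_blocks:
        prev = lam * prev + (1.0 - lam) * E
        smoothed.append(prev)
    return smoothed

# Example: a 150 ms time constant for a low-frequency band at fs = 44100 Hz, T = 1024.
lam_low = smoothing_coefficient(0.150, 1024, 44100)
track = smooth_excitation([1.0] * 200, lam_low)
```

A longer time constant gives a coefficient closer to one, so low-frequency bands would be smoothed more heavily than high-frequency bands.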
In a conversion function ("Specific Noise"), each frequency band of the excitation can be converted into a component value of specific noise (that is, specific loudness), which is measured in sone per ERB.
First, in computing the specific noise, the excitation level in each band of E[b,t] can be transformed into an equivalent excitation level at 1 kHz, as specified, for example, by the equal noise contours normalized by the transmission filter:
$$E_{1\mathrm{kHz}}[b,t] = T_{1\mathrm{kHz}}\left(E[b,t], f_c[b]\right)$$
where T1kHz(E,f) is a function that generates the level at 1 kHz that is equally noisy as the level E at frequency f. Transforming to equivalent levels at 1 kHz simplifies the following specific noise calculation.
Then, the specific noise in each band can be computed as:
$$N[b,t] = \alpha[b,t]\, N_{NB}[b,t] + \left(1 - \alpha[b,t]\right) N_{WB}[b,t]$$
where NNB[b,t] and NWB[b,t] are specific noise values based on a narrowband and a wideband signal model, respectively. The value α[b,t] is an interpolation factor between 0 and 1, which is computed from the audio signal.
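The narrowband and wideband specific noise values NNB[b,t] and NWB[b,t] can be estimated from the transformed excitation using the exponential functions:
$$N_{NB}[b,t] = \begin{cases} G_{NB}\left(\left(\dfrac{E_{1\mathrm{kHz}}[b,t]}{TQ_{1\mathrm{kHz}}}\right)^{\beta_{NB}} - 1\right), & E_{1\mathrm{kHz}}[b,t] > TQ_{1\mathrm{kHz}} \\ 0, & \text{otherwise} \end{cases} \tag{11a}$$
$$N_{WB}[b,t] = \begin{cases} G_{WB}\left(\left(\dfrac{E_{1\mathrm{kHz}}[b,t]}{TQ_{1\mathrm{kHz}}}\right)^{\beta_{WB}} - 1\right), & E_{1\mathrm{kHz}}[b,t] > TQ_{1\mathrm{kHz}} \\ 0, & \text{otherwise} \end{cases} \tag{11b}$$
where TQ1kHz is the excitation level at the threshold in quiet for a 1 kHz tone. From the equal noise contours, TQ1kHz equals 4.2 dB. Note that both of these specific noise functions are equal to zero when the excitation is equal to the threshold in quiet. For excitations greater than the threshold in quiet, both functions grow monotonically with a power law, in accordance with Stevens' law of intensity sensation. The exponent for the narrowband function is chosen to be larger than that for the wideband function, so that the narrowband function increases more rapidly than the wideband function. The specific selection of the exponents β and gains G for the narrowband and wideband cases is chosen to match experimental data on the growth of perceived noise (loudness) for tones and for noise signals.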
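The behavior described above (zero at threshold, power-law growth above it, interpolation between a narrowband and a wideband model, summation over bands) can be sketched as follows. The gains and exponents are placeholder values chosen only so that the narrowband exponent exceeds the wideband one; treating the 4.2 dB threshold as a power ratio, the direction of the interpolation and all names are likewise assumptions.

```python
TQ_1KHZ = 10.0 ** (4.2 / 10.0)   # 4.2 dB expressed as a power ratio (assumed reference)

def specific_noise(E_1khz, gain, beta, tq=TQ_1KHZ):
    """Power-law growth above the threshold in quiet, zero at or below it."""
    if E_1khz <= tq:
        return 0.0
    return gain * ((E_1khz / tq) ** beta - 1.0)

def interpolated_specific_noise(E_1khz, alpha,
                                g_nb=1.0, beta_nb=0.30,    # hypothetical narrowband values
                                g_wb=1.0, beta_wb=0.25):   # hypothetical wideband values
    """Blend of the narrowband and wideband models with factor alpha in [0, 1]."""
    n_nb = specific_noise(E_1khz, g_nb, beta_nb)
    n_wb = specific_noise(E_1khz, g_wb, beta_wb)
    return alpha * n_nb + (1.0 - alpha) * n_wb

def total_noise(specific_values):
    """Overall value for one block: the sum of the specific values over all bands."""
    return sum(specific_values)

E = 100.0 * TQ_1KHZ
n_narrow = interpolated_specific_noise(E, 1.0)
n_wide = interpolated_specific_noise(E, 0.0)
n_mixed = interpolated_specific_noise(E, 0.5)
```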
The specific noise can equal some small value instead of zero when the excitation is at the hearing threshold. The specific noise must then decrease monotonically to zero as the excitation decreases to zero. The rationale is that the hearing threshold is a probabilistic threshold (the point at which the tone is detected 50% of the time), and that a number of tones, each at threshold, presented together can form a sound that is more audible than any of the individual tones. If the specific noise is set to be zero when the excitation is at or below the threshold, then a unique solution for the gain solver does not exist for excitations at or below the threshold. If, on the other hand, the specific noise is defined to be monotonically increasing over all excitation values greater than or equal to zero, then a unique solution exists. Scaling the noise by a factor greater than unity will then always result in a gain greater than unity, and vice versa. The specific noise functions in Equations 11a and 11b can be changed to have the desired property, according to:
where the constant λ is greater than one, the exponent q is less than one, and the constants K and C are chosen so that the specific noise function and its first derivative are continuous at the point:

From the specific noise, the overall or "total" noise L[t] is given by the sum of the specific noise over all bands b:
$$L[t] = \sum_{b} N[b,t]$$
In a specific noise modification function ("Specific Noise Modification"), the target specific noise, referred to as N̂[b,t], can be calculated from the specific noise in various ways. As described in more detail below, a target specific noise can be calculated using a scale factor α, for example in the case of a volume control. See Equation 16 below and its associated description. In the case of automatic gain control (AGC) and dynamic range control (DRC), a target specific noise can be calculated using a ratio of the desired output noise to the input noise. See Equations 17 and 18 below and their associated descriptions. In the case of dynamic equalization, a target specific noise can be calculated using a relationship established in Equation 23 and its associated description. In this example, for each band b and each time block t, a gain solver function takes as its input the smoothed excitation Ē[b,t] and the target specific noise N̂[b,t] and generates spectral weighting factors, also called gains G[b,t], which are subsequently used to modify the audio. Letting the function Ψ{·} represent the non-linear transformation from excitation to specific noise, so that
$$N[b,t] = \Psi\{\bar{E}[b,t]\} \tag{13}$$
the gain solver finds G[b,t] such that
$$\hat{N}[b,t] = \Psi\{G^2[b,t]\, \bar{E}[b,t]\} \tag{14a}$$
The gain solver function determines frequency- and time-varying gains (spectral weighting factors) which, when applied to the original excitation, result in a specific noise that, ideally, is equal to the desired target specific noise. In practice, the gain solver function determines frequency- and time-varying gains which, when applied to the frequency-domain version of the audio signal, modify the audio signal so as to reduce the difference between its specific noise and the target specific noise. Ideally, the modification is such that the modified audio signal has a specific noise that is a close approximation of the target specific noise. The solution to Equation 14a can be implemented in a variety of ways. For example, if a closed-form mathematical expression for the inverse of the specific noise function, represented by Ψ⁻¹{·}, exists, then the gains can be computed directly by rearranging Equation 14a:
$$G[b,t] = \sqrt{\frac{\Psi^{-1}\{\hat{N}[b,t]\}}{\bar{E}[b,t]}} \tag{14b}$$
Alternatively, if a closed-form solution for Ψ⁻¹{·} does not exist, an iterative approach can be employed in which, for each iteration, Equation 14a is evaluated using a current estimate of the gains. The resulting specific noise is compared to the desired target, and the gains are updated based on the error. If the gains are updated properly, they will converge to the desired solution. As mentioned before, the target specific noise can be represented by a scaling Ξ[b,t] of the specific noise:
$$\hat{N}[b,t] = \Xi[b,t]\, N[b,t] \tag{14c}$$
Substituting Equation 13 into 14c and then 14c into 14b yields an alternative expression for the gains:
$$G[b,t] = \sqrt{\frac{\Psi^{-1}\{\Xi[b,t]\, \Psi\{\bar{E}[b,t]\}\}}{\bar{E}[b,t]}}$$
The spectral weighting factors or calculated gains are stored in the storage device's lookup table.
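The two ways of solving for the gain described above can be compared in a short sketch. It assumes a single power-law function for the excitation-to-specific-noise transformation with placeholder constants, and it uses bisection as the iterative update, which is a substitute chosen for simplicity and not an update rule named in the source. Both routines assume an excitation above the threshold, where the transformation is strictly increasing.

```python
import math

TQ = 10.0 ** (4.2 / 10.0)   # threshold in quiet as a power ratio (assumed reference)
GAIN, BETA = 1.0, 0.3       # hypothetical model constants

def psi(E):
    """Assumed excitation-to-specific-noise transformation."""
    return GAIN * ((E / TQ) ** BETA - 1.0) if E > TQ else 0.0

def psi_inverse(n):
    """Closed-form inverse of psi for n >= 0."""
    return TQ * (n / GAIN + 1.0) ** (1.0 / BETA)

def gain_closed_form(E, scale):
    """Gain g such that psi(g^2 * E) equals scale * psi(E), via the inverse."""
    return math.sqrt(psi_inverse(scale * psi(E)) / E)

def gain_iterative(E, scale, iterations=60):
    """Same gain found by bisection on g, without using the inverse."""
    target = scale * psi(E)
    lo, hi = 0.0, 1.0
    while psi(hi * hi * E) < target:      # widen the bracket until it contains the target
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if psi(mid * mid * E) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

E = 1000.0 * TQ
g_direct = gain_closed_form(E, 0.5)
g_search = gain_iterative(E, 0.5)
```

Either result could then be written into the look-up table entry for the corresponding excitation value, band and scaling.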
In some embodiments according to the invention, the excitation determiner does not determine a value of an excitation parameter for all subbands of the plurality of subbands. In this case, it is sufficient that the look-up table contains only spectral weighting factors associated with the subbands for which a value of an excitation parameter is determined. In this way, the storage space required by the storage device to store the look-up table can be significantly reduced.
Since the curvature of the equal noise contours, which must be compensated, is stronger for the lower frequencies (see Figures 2 and 3), it may be sufficient to compensate for a noise variation only in the low-frequency subbands. Therefore, it can be useful to calculate excitation parameters and store spectral weighting factors only for low-frequency subbands. In contrast, for high-frequency subbands, no value of an excitation parameter need be determined and no spectral weighting factors associated with the high-frequency subbands need be stored. In other words, a subband for which a value of an excitation parameter is determined can comprise frequencies lower than a subband for which no value of an excitation parameter is determined.
Also, it may not be necessary to modify the high-frequency subbands. In other words, the content of a subband may not be modified by the signal modifier if the excitation determiner does not determine a value of an excitation parameter for that subband. This may be the case only if no other parameter, such as an external modification parameter or a base noise parameter, is taken into account.
Alternatively, a spectral weighting factor provided by the storage device can be used by the signal modifier for more than one subband. In other words, the signal modifier can modify the content of a subband for which no value of an excitation parameter is determined, based on a spectral weighting factor provided for a subband for which a value of an excitation parameter is determined. Considering the behavior of the equal noise contours shown in Figures 2 and 3, it may be sufficient to modify the high-frequency bands according to the same spectral weighting factor. This spectral weighting factor can be the one provided for the subband comprising the highest frequencies of all the subbands for which a value of an excitation parameter is determined. More generally, the signal modifier can modify the content of a subband for which no value of an excitation parameter is determined, based on the spectral weighting factor provided for the subband that contains higher frequencies than all other subbands for which a value of an excitation parameter is determined. For example, it may be sufficient that the excitation determiner determines the value of an excitation parameter only for 5 to 15 (or 2 to 20, 7 to 12, or only 5, 6, 7, 8, 9, 10, 11 or 12) subbands of the plurality of subbands, or only for fewer than a quarter, a third, half or two-thirds of the subbands of the plurality of subbands. These subbands may comprise frequencies lower than all other subbands of the plurality of subbands. Furthermore, the signal modifier can modify the contents of these subbands according to the spectral weighting factors provided by the storage device for these subbands.
For example, the Bark scale comprises 25 frequency bands, and it may be sufficient to modify the 7 bands with the lowest frequencies, since the lower frequency bands exhibit the strongest deviation from the ideal behavior. Alternatively, the lower bands of the ERB scale can be modified. The remaining subbands of the plurality of subbands can remain unmodified, can be modified in accordance with an external modification parameter and/or a base noise parameter, or can be modified in accordance with the spectral weighting factor provided for the subband that contains higher frequencies than all other subbands for which a value of an excitation parameter is determined.
Figure 5 presents a flowchart of a method 500 for modifying an input audio signal in accordance with an embodiment of the invention. Method 500 comprises determining 510 a value of an excitation parameter of a subband of a plurality of subbands of the input audio signal based on an energy content of the subband. Further, method 500 comprises providing 520 a spectral weighting factor corresponding to the determined value of the excitation parameter and corresponding to the subband for which the value of the excitation parameter is determined. The spectral weighting factor is stored in a look-up table containing a plurality of spectral weighting factors. A spectral weighting factor from the plurality of spectral weighting factors is associated with a predefined value of the excitation parameter and a subband from the plurality of subbands. Finally, method 500 comprises modifying 530 the subband for which the excitation parameter value is determined, based on the provided spectral weighting factor, to obtain and provide a modified subband.
In other words, method 500 comprises calculating 510 an excitation signal, retrieving 520 spectral weights (spectral weighting factors) from the look-up table, and modifying 530 the input audio signal. Optionally, method 500 comprises resynthesizing the output audio signal (combining the subbands to obtain a modified audio signal).
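The three steps of method 500 can be sketched for a set of subband signals as follows. The sketch assumes that the excitation parameter is the mean energy of the subband samples, that it is quantized to the nearest predefined value, and that the look-up table is indexed as lut[band][excitation index]; these choices and all names are illustrative.

```python
def determine_excitation(subband):
    """Step 510: short-time energy of one subband block (mean of squared samples)."""
    return sum(s * s for s in subband) / len(subband)

def quantize(value, predefined_values):
    """Index of the predefined excitation value closest to the measured value."""
    return min(range(len(predefined_values)),
               key=lambda i: abs(predefined_values[i] - value))

def modify_signal(subbands, lut, predefined_values):
    """Steps 510-530 for every subband: measure, look up, scale multiplicatively."""
    modified = []
    for band, samples in enumerate(subbands):
        idx = quantize(determine_excitation(samples), predefined_values)  # step 510
        weight = lut[band][idx]                                           # step 520
        modified.append([weight * s for s in samples])                    # step 530
    return modified

# Two subbands, three predefined excitation values, one weight per (band, value) pair.
predefined = [0.0, 0.5, 1.0]
table = [[1.0, 0.8, 0.5],
         [2.0, 1.5, 1.0]]
out = modify_signal([[1.0, -1.0, 1.0, -1.0], [0.1, 0.1, 0.1, 0.1]], table, predefined)
```

No auditory model is evaluated at run time in this flow; the only per-block operations are an energy measurement, a table access and a multiplication.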
This can, for example, be a method for efficient and generic signal modification.
Still optionally, an external modification parameter can also be taken into account (indicated by the dashed line), as described above.
The additional consideration of a subband level of background noise (a background noise parameter) is illustrated by method 600, shown in Figure 6.
Some embodiments according to the invention relate to an efficient realization of perceptual processing of audio signals. The concept described refers to a flexible and highly efficient architecture for frequency selective audio signal modification and processing, which can easily incorporate the characteristics of psychoacoustic effects in its processing, without suffering the computational burden of explicit auditory modeling. As an example, the realization of a multi-band processor for perceptual noise control is considered, which is based on the presented architecture.
This can be an efficient realization of psychoacoustic noise control.
The processing described above is comparable to filtering the input signal with a filter whose characteristic is controlled by the input level within each auditory frequency band. It can be implemented more efficiently.
Basically, the proposed method avoids the calculation of the specific noise and the corresponding inverse calculation, and thus avoids computationally intensive processing steps at the expense of slightly increased memory requirements.
The efficient implementation can be realized using a simple look-up table (LUT), possibly with interpolation.
The LUT is computed by measuring the input and output values of the processing implemented as described above. The LUT has, for example, three dimensions. It yields the modified subband or the modified audio signal, given the input excitation, the modification parameter and the frequency band index.
For example, the LUT can be implemented efficiently by recognizing that its functionality depends on the frequency band index only for the lower frequency bands. For example, when using an auditory filterbank with a resolution corresponding to the Bark scale, the filterbank can have 25 bandpass filters. Storing the transfer function in the LUT for only the lowest 7 bands can be sufficient, since for higher band indices the same input-output relation holds as for band index 7.
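Filling and reading such a table can be sketched as follows. The reference function stands in for the explicit processing whose input and output values are measured; it is a made-up placeholder, as are the grids, the table layout lut[band][excitation index][modification index] and the function names.

```python
def toy_reference_gain(band, excitation, modification):
    """Placeholder for the explicit processing that is being tabulated."""
    low_band_effect = 0.05 * (6 - min(band, 6))     # vanishes from the 7th band upward
    return modification / (1.0 + low_band_effect * excitation)

def build_lut(reference, num_stored_bands, excitation_grid, modification_grid):
    """Tabulate reference(band, excitation, modification) on the given grids."""
    return [
        [[reference(b, e, m) for m in modification_grid] for e in excitation_grid]
        for b in range(num_stored_bands)
    ]

def lookup(lut, band, excitation_idx, modification_idx):
    """Bands above the last stored band reuse the entries of the last stored band."""
    return lut[min(band, len(lut) - 1)][excitation_idx][modification_idx]

# Store only 7 of 25 bands; bands 7..24 share the entries of band index 6.
lut = build_lut(toy_reference_gain, 7, [0.0, 1.0, 2.0], [0.5, 1.0])
high_band_weight = lookup(lut, 24, 1, 1)
```

With 7 stored bands instead of 25, the table in this example holds 7 × 3 × 2 entries instead of 25 × 3 × 2.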
This efficient processing produces a volume control that is correct in a psychoacoustic sense. Other applications, namely dynamic range control and/or dynamic equalization, are derived with efficient processing as described above, such as by proper LUT indexing.
Finally, the background noise compensation (that is, the compensation for the partial masking effect of an audio signal in the presence of background noise) can be achieved by adding a fourth dimension to the LUT, representing the level of background noise. The block diagram of the proposed processing for noise compensation is illustrated in Figure 6.
Although the processing described so far is aimed at emulating a psychoacoustic noise scaling algorithm, the architecture described in Figure 1 or Figure 4 can, owing to its LUT, produce a much richer spectrum of sound modifications than would be available with a psychoacoustic noise scaling algorithm. The LUT can be made dependent on even more factors (e.g. a user preference setting, other time-varying factors, etc.). It can be freely "tuned" according to the subjective preference of the listener, beyond the characteristics that are provided by a function given as a closed-form expression.
In short, the invention relates to a flexible and highly efficient architecture for frequency selective audio signal modification and processing, which can easily incorporate the characteristics of psychoacoustic effects in its processing, without suffering from the computational burden of explicit auditory modelling.
At a summary level, the proposed efficient processing comprises the following steps. Based on the input signal, one or more feature values (including the excitation parameter value) can be calculated for different frequency bands (e.g. critical bands). Based on these feature values (and possibly other information), a table look-up is performed for each of these frequency bands to determine one or several table output parameters (spectral weighting factors) for each frequency band. These table output parameters are then used to determine the modification (e.g. multiplicative scaling) of the input signal in the corresponding frequency bands.
The processing of audio signals in frequency bands usually involves the use of filterbanks, that is, the input signal is divided into several frequency bands (subbands) by an analysis filterbank, and the final output signal is obtained by feeding the modified subband signals into the synthesis filterbank. The analysis and synthesis filterbanks combine to reconstruct the input time signal perfectly or almost perfectly.
A typical number of frequency bands is between 4 and 40. Table look-up based on feature values generally involves quantizing the feature values into a limited set of values that can be used as a look-up index into the table. In addition, the size of the look-up table can be reduced by choosing a very coarse quantization step size and subsequently interpolating between adjacent (two or more) table output parameter values. In order to consider several input features for computing the output parameter values, a look-up table with several dimensions can be used, for example, modification factor = LUT[excitation idx, tonality idx, frequency idx]. In a very simple (and efficient) case, the output parameter values directly represent multiplication factors to be applied to the input subband signals in order to determine the output subband signals. This is shown, for example, in Figure 4.
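Coarse quantization combined with interpolation between adjacent table entries can be sketched for one table dimension as follows. Linear interpolation and clamping at the ends of the grid are assumptions made for this example; the function name and the sample values are hypothetical.

```python
import bisect

def interpolated_lookup(grid, table_row, value):
    """Linearly interpolate table_row (defined on the sorted grid) at the given value."""
    if value <= grid[0]:
        return table_row[0]           # clamp below the first grid point
    if value >= grid[-1]:
        return table_row[-1]          # clamp above the last grid point
    hi = bisect.bisect_right(grid, value)   # first grid point greater than value
    lo = hi - 1
    frac = (value - grid[lo]) / (grid[hi] - grid[lo])
    return (1.0 - frac) * table_row[lo] + frac * table_row[hi]

# Three stored weights on a coarse excitation grid (in arbitrary level units).
excitation_grid = [0.0, 10.0, 20.0]
weights = [1.0, 0.5, 0.25]
w_mid = interpolated_lookup(excitation_grid, weights, 5.0)
```

A coarser grid shrinks the table at the cost of one extra multiply-add per look-up, which is the trade-off described in the paragraph above.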
Although some aspects of the described concept have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or an aspect of a method step. Similarly, features described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example, a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system so that the respective method is carried out. Therefore, the digital storage medium can be computer readable.
Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, so that one of the methods described herein is carried out.
Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative to perform one of the methods when the computer program product is executed on a computer. The program code can, for example, be stored on a machine readable carrier.
Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
In other words, an embodiment of the inventive method is therefore a computer program having program code for performing one of the methods described herein, when the computer program is executed on a computer.
A further embodiment of the inventive methods is therefore a data carrier (or a digital storage medium, or a computer readable medium) comprising, recorded thereon, the computer program for carrying out one of the methods described herein.
A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing the computer program for carrying out one of the methods described herein. The data stream or signal sequence can, for example, be configured to be transferred via a data communication connection, for example via the Internet.
A further embodiment comprises a processing means, for example a computer or a programmable logic device configured or adapted to carry out one of the methods described herein.
A further embodiment comprises a computer having the computer program installed therein to carry out one of the methods described herein.
In some embodiments, a programmable logic device (e.g., a field programmable gate array) can be used to perform some or all of the functionality of the methods described herein. In some embodiments, a field programmable gate array can cooperate with a microprocessor to perform one of the methods described herein. Generally speaking, the methods are preferably performed by any hardware device.
The embodiments described above are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. Therefore, it is intended to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

Claims (20)
1. APPARATUS (100) FOR MODIFYING AN INPUT AUDIO SIGNAL, characterized in that it comprises: an excitation determiner (110) configured to determine a value (112) of an excitation parameter of a subband (102) of a plurality of subbands of the input audio signal based on an energy content of the subband (102), wherein the value of the excitation parameter indicates a power of the audio signal in the subband, a short-time energy of the audio signal in the subband, or a quantized value of the short-time energy of the audio signal in the subband; a storage device (120) that stores a look-up table containing a plurality of spectral weighting factors, wherein a spectral weighting factor of the plurality of spectral weighting factors is associated with a predefined value of the excitation parameter and a subband of the plurality of subbands, wherein the storage device is configured to provide a spectral weighting factor (124) corresponding to the determined value (112) of the excitation parameter and corresponding to the subband (102) for which the value (112) of the excitation parameter is determined; and a signal modifier (130) configured to modify a content of the subband (102) of the input audio signal for which the value (112) of the excitation parameter is determined, based on the provided spectral weighting factor (124), to provide a modified subband (132) by multiplicative scaling of the subband of the audio signal by the spectral weighting factor provided by the look-up table.
2. APPARATUS according to claim 1, characterized in that the excitation determiner (110) is configured to determine a value (112) of an excitation parameter for more than one subband (102) of the plurality of subbands, wherein the storage device (120) is configured to provide a spectral weighting factor (124) for each subband (102) for which the value (112) of the excitation parameter is determined, and wherein the signal modifier (130) is configured to modify a content of each subband (102) for which a value (112) of the excitation parameter is determined, based on the respective provided spectral weighting factor (124).
3. APPARATUS according to claim 1 or 2, characterized in that it further comprises: an analysis filterbank (410) configured to separate the input audio signal into the plurality of subbands; and a synthesis filterbank (420) configured to combine the plurality of subbands containing at least one modified subband (132) to provide a modified audio signal.
4. APPARATUS according to claim 1, characterized in that each spectral weighting factor contained in the look-up table is associated with a predefined value of the excitation parameter and a subband of the plurality of subbands.
5. APPARATUS according to claim 1, characterized in that the subbands of the plurality of subbands of the input audio signal are divided according to the ERB scale, the Bark scale or another frequency spacing that mimics the frequency resolution of the human ear.
6. APPARATUS according to claim 1, characterized in that the excitation determiner (110) is configured to determine a value (112) of an excitation parameter not for all subbands of the plurality of subbands, and wherein the look-up table contains only spectral weighting factors associated with subbands for which a value of an excitation parameter is determined.
7. APPARATUS according to claim 6, characterized in that a subband (102) for which a value (112) of an excitation parameter is determined comprises frequencies lower than a subband for which no value of an excitation parameter is determined.
8. APPARATUS according to claim 6, characterized in that a content of a subband of the input audio signal is not modified by the signal modifier (130) if the excitation determiner (110) does not determine a value (112) of an excitation parameter for that subband.
9. APPARATUS according to claim 1, characterized in that the excitation determiner (110) is configured to determine a value (112) of an excitation parameter only for fewer than one third of the subbands of the plurality of subbands, and wherein the signal modifier (130) is configured to modify a content of the subbands for which a value of an excitation parameter is determined, based on the respective corresponding provided spectral weighting factor, these subbands, for which a value of an excitation parameter is determined, comprising frequencies lower than all other subbands of the plurality of subbands.
10. APPARATUS according to claim 1, characterized in that the signal modifier (130) is configured to modify a content of a subband for which no value of an excitation parameter is determined, based on a spectral weighting factor (124) provided for a subband (102) for which a value (112) of an excitation parameter is determined.
11. APPARATUS according to claim 10, characterized in that the signal modifier (130) modifies a content of a subband for which no value of an excitation parameter is determined, based on a spectral weighting factor (124) provided for the subband (102) which, among the subbands (102) for which a value (112) of an excitation parameter is determined, contains frequencies higher than all other such subbands (102).
12. APPARATUS according to claim 1, characterized in that a spectral weighting factor contained in the look-up table is further associated with a predefined value of an external modification parameter, wherein the storage device (120) is configured to provide a spectral weighting factor (124) corresponding to the determined value (112) of the excitation parameter of a subband (102), corresponding to the subband (102) for which the value (112) of the excitation parameter is determined, and corresponding to a value of the external modification parameter.
13. APPARATUS according to claim 12, characterized in that the look-up table comprises exactly three dimensions associated with the predefined values of the excitation parameter, the subbands of the plurality of subbands and the predefined values of the external modification parameter.
14. APPARATUS according to claim 12, characterized in that the signal modifier (130) is configured to modify a content of a subband for which no value of an excitation parameter is determined, based on a value of the external modification parameter.
15. APPARATUS according to claim 1, characterized in that a spectral weighting factor contained in the look-up table is further associated with a predefined value of a base noise parameter, wherein the storage device (120) is configured to provide a spectral weighting factor (124) corresponding to the determined value (112) of the excitation parameter of the subband (102), corresponding to the subband (102) for which the value (112) of the excitation parameter is determined, and corresponding to a value of the base noise parameter.
16. APPARATUS according to claim 15, characterized in that the look-up table comprises exactly four dimensions associated with the predefined values of the excitation parameter, the subbands of the plurality of subbands, the predefined values of the external modification parameter and the predefined values of the base noise parameter.
17. APPARATUS according to claim 1, characterized in that the storage device (120) does not comprise an input for a specific noise parameter or a target specific noise parameter.
18. APPARATUS according to claim 1, characterized in that the look-up table stored by the storage device (120) is the only look-up table of the apparatus for modifying the input audio signal.
[0019]
19. APPARATUS according to claim 1, characterized in that the excitation determiner (110) is configured to measure an energy content of the subband (102) and to quantize the measured energy content of the subband to obtain the value of the excitation parameter, so that the value of the excitation parameter is equal to a predefined value of the excitation parameter.
[0020]
20. METHOD (500, 600) FOR MODIFYING AN INPUT AUDIO SIGNAL, characterized in that it comprises: determining (510) a value of an excitation parameter of a subband of a plurality of subbands of the input audio signal based on an energy content of the subband, wherein the value of the excitation parameter indicates a power of the audio signal in the subband, a short-time energy of the audio signal in the subband, or a quantized value of the short-time energy of the audio signal in the subband; providing (520) a spectral weighting factor corresponding to the determined value of the excitation parameter and corresponding to the subband for which the value of the excitation parameter is determined, wherein the spectral weighting factor is stored in a look-up table containing a plurality of spectral weighting factors, wherein a spectral weighting factor of the plurality of spectral weighting factors is associated with a predefined value of the excitation parameter and a subband of the plurality of subbands; and modifying (530) the subband for which the value of the excitation parameter is determined, based on the provided spectral weighting factor, to provide a modified subband, by multiplicative scaling of the subband of the audio signal by the spectral weighting factor provided by the look-up table.
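The three steps of the method claim (determine an excitation value, look up a weighting factor, scale the subband) can be illustrated with a minimal Python sketch. It is not taken from the patent: the preset levels, the table contents, the dB scale, the nearest-level quantization rule and all function names are assumptions chosen only to make the flow concrete.

```python
import math

# Hypothetical predefined excitation values (in dB); one table row per value.
PRESET_LEVELS_DB = [-60.0, -40.0, -20.0, 0.0]

# Hypothetical look-up table: rows = preset levels, columns = subbands.
# The weighting factors are invented for illustration only.
WEIGHT_TABLE = [
    [2.0, 1.8, 1.5],  # -60 dB
    [1.6, 1.4, 1.2],  # -40 dB
    [1.2, 1.1, 1.0],  # -20 dB
    [1.0, 1.0, 1.0],  #   0 dB
]


def short_time_energy_db(samples):
    """Mean power of the subband samples in dB, floored to avoid log(0)."""
    power = sum(abs(s) ** 2 for s in samples) / len(samples)
    return 10.0 * math.log10(max(power, 1e-12))


def quantize_level(level_db):
    """Index of the preset level closest to level_db (assumed quantizer)."""
    return min(
        range(len(PRESET_LEVELS_DB)),
        key=lambda i: abs(PRESET_LEVELS_DB[i] - level_db),
    )


def modify_subband(samples, subband_index):
    """Determine the excitation value, fetch the factor, scale the subband."""
    level_index = quantize_level(short_time_energy_db(samples))
    weight = WEIGHT_TABLE[level_index][subband_index]
    return [weight * s for s in samples]
```

In this sketch the table has two dimensions (excitation value and subband). A table with an additional external modification parameter, as in the dependent claims, would add one more index to the same lookup.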

Similar technologies:

Publication number | Publication date | Patent title

BR112012026984B1|2021-06-08|apparatus and method for modifying an incoming audio signal

US10396738B2|2019-08-27|Methods and apparatus for adjusting a level of an audio signal

EP1629463B1|2007-08-22|Method, apparatus and computer program for calculating and adjusting the perceived loudness of an audio signal

EP2002429B1|2012-11-21|Controlling a perceived loudness characteristic of an audio signal

RU2434310C2|2011-11-20|Measuring loudness with spectral modifications

EP2002426A1|2008-12-17|Audio signal loudness measurement and modification in the mdct domain

EP1835487B1|2013-07-10|Method, apparatus and computer program for calculating and adjusting the perceived loudness of an audio signal

Patent family:

Publication number | Publication date

PL2381574T3|2015-05-29|

US20130046546A1|2013-02-21|

RU2573246C2|2016-01-20|

CN102986136A|2013-03-20|

AU2011244268B2|2014-07-24|

CA2796948A1|2011-10-27|

EP2381574A1|2011-10-26|

CN102986136B|2016-02-10|

ES2526761T3|2015-01-15|

MX2012012113A|2013-02-26|

JP5632532B2|2014-11-26|

HK1161443A1|2012-08-24|

US8812308B2|2014-08-19|

KR101469339B1|2014-12-04|

EP2381574B1|2014-12-03|

RU2012149697A|2014-05-27|

JP2013537726A|2013-10-03|

CA2796948C|2016-10-18|

KR20130008609A|2013-01-22|

WO2011131732A1|2011-10-27|

Cited references:

Publication number | Filing date | Publication date | Applicant | Patent title

US4641361A|1985-04-10|1987-02-03|Harris Corporation|Multi-band automatic gain control apparatus|

US5255323A|1990-04-02|1993-10-19|Pioneer Electronic Corporation|Digital signal processing device and audio apparatus using the same|

JP3119677B2|1991-06-10|2000-12-25|ローム株式会社|Signal processing circuit|

JP3322479B2|1994-05-13|2002-09-09|アルパイン株式会社|Audio equipment|

US6041297A|1997-03-10|2000-03-21|At&T Corp|Vocoder for coding speech by using a correlation between spectral magnitudes and candidate excitations|

JP3505085B2|1998-04-14|2004-03-08|アルパイン株式会社|Audio equipment|

US6351529B1|1998-04-27|2002-02-26|3Com Corporation|Method and system for automatic gain control with adaptive table lookup|

US6029126A|1998-06-30|2000-02-22|Microsoft Corporation|Scalable audio coder and decoder|

JP4522509B2|1999-07-07|2010-08-11|アルパイン株式会社|Audio equipment|

DE60033826T2|1999-07-28|2007-11-08|Clear Audio Ltd.|AMPLIFICATION CONTROL OF AUDIO SIGNALS IN A SOUND ENVIRONMENT WITH THE HELP OF A FILTER BANK|

US7072477B1|2002-07-09|2006-07-04|Apple Computer, Inc.|Method and apparatus for automatically normalizing a perceived volume level in a digitally encoded file|

US8199933B2|2004-10-26|2012-06-12|Dolby Laboratories Licensing Corporation|Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal|

CN101421781A|2006-04-04|2009-04-29|杜比实验室特许公司|Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal|

DE102006047197B3|2006-07-31|2008-01-31|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Device for processing realistic sub-band signal of multiple realistic sub-band signals, has weigher for weighing sub-band signal with weighing factor that is specified for sub-band signal around subband-signal to hold weight|

JP4706666B2|2007-05-28|2011-06-22|日本ビクター株式会社|Volume control device and computer program|

PL2232700T3|2007-12-21|2015-01-30|Dts Llc|System for adjusting perceived loudness of audio signals|

JP5101292B2|2004-10-26|2012-12-19|ドルビーラボラトリーズライセンシングコーポレイション|Calculation and adjustment of audio signal's perceived volume and / or perceived spectral balance|

CN103325380B|2012-03-23|2017-09-12|杜比实验室特许公司|Gain for signal enhancing is post-processed|

EP3547312A1|2012-05-18|2019-10-02|Dolby Laboratories Licensing Corp.|System and method for dynamic range control of an audio signal|

CN103730131B|2012-10-12|2016-12-07|华为技术有限公司|The method and apparatus of speech quality evaluation|

JP6129348B2|2013-01-21|2017-05-17|ドルビーラボラトリーズライセンシングコーポレイション|Optimization of loudness and dynamic range across different playback devices|

WO2014130585A1|2013-02-19|2014-08-28|Max Sound Corporation|Waveform resynthesis|

EP2959479B1|2013-02-21|2019-07-03|Dolby International AB|Methods for parametric multi-channel encoding|

CN107093991B|2013-03-26|2020-10-09|杜比实验室特许公司|Loudness normalization method and equipment based on target loudness|

EP2981910A1|2013-04-05|2016-02-10|Dolby Laboratories Licensing Corporation|Acquisition, recovery, and matching of unique information from file-based media for automated file detection|

CN105164918B|2013-04-29|2018-03-30|杜比实验室特许公司|Band compression with dynamic threshold|

EP3044786A1|2013-09-12|2016-07-20|Dolby Laboratories Licensing Corporation|Loudness adjustment for downmixed audio content|

EP3044876B1|2013-09-12|2019-04-10|Dolby Laboratories Licensing Corporation|Dynamic range control for a wide variety of playback environments|

CN105142067B|2014-05-26|2020-01-07|杜比实验室特许公司|Audio signal loudness control|

CN112185401A|2014-10-10|2021-01-05|杜比实验室特许公司|Program loudness based on transmission-independent representations|

EP3089364B1|2015-05-01|2019-01-16|Nxp B.V.|A gain function controller|

EP3171614B1|2015-11-23|2020-11-04|Goodix TechnologyCompany Limited|A controller for an audio system|

EP3459075A4|2016-05-20|2019-08-28|Cambridge Sound Management, Inc.|Self-powered loudspeaker for sound masking|

JP6844383B2|2017-03-31|2021-03-17|株式会社アドヴィックス|Vehicle braking device|

US10762910B2|2018-06-01|2020-09-01|Qualcomm Incorporated|Hierarchical fine quantization for audio coding|

US11205414B2|2019-02-15|2021-12-21|Brainfm, Inc.|Noninvasive neural stimulation through audio|

CN110010154B|2019-03-26|2021-04-09|北京雷石天地电子技术有限公司|Volume balancing method and device|

Legal status:
2017-10-10| B15I| Others concerning applications: loss of priority|

2017-11-14| B12F| Appeal: other appeals|

2019-12-10| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|

2021-03-30| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|

2021-06-08| B16A| Patent or certificate of addition of invention granted|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 20/04/2011, SUBJECT TO THE LEGAL CONDITIONS. PATENT GRANTED IN ACCORDANCE WITH ADI 5.529/DF |

Priority:

Application number | Filing date | Patent title

EP10160679.6A|EP2381574B1|2010-04-22|2010-04-22|Apparatus and method for modifying an input audio signal|

EP10160679.6|2010-04-22|

PCT/EP2011/056355|WO2011131732A1|2010-04-22|2011-04-20|Apparatus and method for modifying an input audio signal|
