Brazilian patent BR112013026850B1: AUDIO/SPEECH ENCODING AND DECODING APPARATUSES AND AUDIO/SPEECH ENCODING AND DECODING METHODS

Abstract:
"DEVICE AND METHOD FOR PERFORMING HUFFMAN CODING". The present invention relates to Huffman table design, which can be done offline with a large database of input sequences. The range of the quantization indices (or differential indices) for Huffman coding is identified. For each value of the range, all input signals that have the same range are gathered, and the probability distribution of each value of the quantization indices (or differential indices) within the range is calculated. For each value of the range, one Huffman table is designed according to the probability distribution. To improve the bit efficiency of Huffman coding, apparatus and methods for reducing the range of the quantization indices (or differential indices) are also introduced.
Publication number: BR112013026850B1
Application number: R112013026850-6
Filing date: 2012-03-12
Publication date: 2021-02-23
Inventors: Zongxian Liu; Kok Seng CHONG; Masahiro Oshikiri
Applicant: Panasonic Intellectual Property Corporation Of America
Main IPC class:

Description:

Technical Field
[0001] The present invention relates to an audio/speech encoding apparatus, an audio/speech decoding apparatus, and audio/speech encoding and decoding methods that use Huffman coding.
Background Art
[0002] In signal compression, Huffman coding is widely used to encode an input signal using a variable length code table (VL) (Huffman table). Huffman coding is more efficient than fixed length (FL) coding for the input signal which has a statistical distribution that is not uniform.
[0003] In Huffman coding, the Huffman table is derived in a particular way based on the estimated probability of occurrence for each possible value of the input signal. During encoding, each value of the input signal is mapped to a particular variable length code in the Huffman table.
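As a rough illustration of how a table can be derived from estimated probabilities, the sketch below builds Huffman code lengths with the classic merge-the-two-least-likely-symbols procedure. The symbol names and probabilities are hypothetical and are not taken from the patent; the function names are likewise illustrative.

```python
import heapq
from itertools import count


def huffman_code_lengths(probabilities):
    """Return {symbol: code length in bits} for a dict of symbol probabilities."""
    if len(probabilities) == 1:
        return {symbol: 1 for symbol in probabilities}
    tie = count()  # unique tie-breaker so heapq never has to compare the dicts
    heap = [(p, next(tie), {symbol: 0}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        # Every symbol under the merged node sits one level deeper in the tree.
        merged = {s: depth + 1 for s, depth in {**left, **right}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]


def expected_bits(probabilities, lengths):
    """Average number of bits per symbol for the given code lengths."""
    return sum(p * lengths[s] for s, p in probabilities.items())


# Hypothetical, strongly non-uniform distribution over four symbols.
probs = {"a": 0.7, "b": 0.15, "c": 0.1, "d": 0.05}
lengths = huffman_code_lengths(probs)
average = expected_bits(probs, lengths)  # compare with 2 bits for fixed-length coding
```

With this skewed distribution the most likely symbol receives a 1-bit code, so the average falls below the 2 bits per symbol that a fixed-length code would need for four symbols.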
[0004] By encoding the signal values that are statistically more likely to occur with relatively short VL codes (using relatively few bits) and, conversely, the signal values that are statistically less likely to occur with relatively long VL codes (using relatively more bits), the total number of bits used to encode the input signal can be reduced.
Citation List
[0005] [Non-patent document 1] ITU-T Recommendation G.719 (06/2008), "Low-complexity, full-band audio coding for high-quality, conversational applications"
Summary of the Invention
Technical Problem
[0006] However, in some applications, such as audio signal encoding, the signal statistics can vary significantly from one set of audio signals to another, and even within the same set of audio signals.
[0007] If the statistics of the audio signal differ drastically from the statistics underlying the predefined Huffman table, the signal cannot be encoded optimally. In that case, the Huffman coding can consume many more bits than fixed-length coding.
[0008] One possible solution is to include both Huffman coding and fixed-length coding in the encoder and to select the coding method that consumes fewer bits. An indication signal is transmitted to the decoder side to indicate which coding method was selected in the encoder. This solution is used in the recently standardized ITU-T G.719 speech codec.
[0009] This solution solves the problem for some very extreme sequences in which Huffman coding consumes more bits than fixed-length coding. However, for other input signals whose statistics differ from those of the Huffman table but which still select Huffman coding, the solution is still not optimal.
[00010] In the ITU-T G.719 speech codec, Huffman coding is used in coding the quantization indices of the norm factors.
[00011] The structure of G.719 is illustrated in Figure 1.
[00012] On the encoder side, the input signal sampled at 48 kHz is processed by a transient detector (101). Depending on whether a transient is detected, a high-frequency-resolution or a low-frequency-resolution transform (102) is applied to the input signal frame. The resulting spectral coefficients are grouped into bands of unequal lengths. The norm of each band is estimated (103), and the resulting spectral envelope, consisting of the norms of all bands, is quantized and encoded (104). The coefficients are then normalized by the quantized norms (105). The quantized norms are further adjusted (106) based on adaptive spectral weighting and used as input for the bit allocation (107). The normalized spectral coefficients are lattice-vector quantized and encoded (108) based on the bits allocated to each frequency band. The level of the non-coded spectral coefficients is estimated, encoded (109) and transmitted to the decoder. Huffman coding is applied to the quantization indices of both the coded spectral coefficients and the encoded norms.
[00013] On the decoder side, the transient indication, which signals the frame configuration (stationary or transient), is decoded first. The spectral envelope is decoded, and the same bit-exact norm adjustment and bit-allocation algorithms are used at the decoder to recompute the bit allocation, which is essential for decoding the quantization indices of the normalized transform coefficients. After dequantization (112), the low-frequency non-coded spectral coefficients (those allocated zero bits) are regenerated using a spectral-fill codebook built from the received spectral coefficients (those with non-zero bit allocation) (113). The noise level adjustment index is used to adjust the level of the regenerated coefficients. High-frequency non-coded spectral coefficients are regenerated using bandwidth extension. The decoded spectral coefficients and the regenerated spectral coefficients are merged to form the normalized spectrum. The decoded spectral envelope is then applied, yielding the decoded full-band spectrum (114). Finally, the inverse transform (115) is applied to recover the time-domain decoded signal: the inverse modified discrete cosine transform for stationary frames, or the inverse of the higher-temporal-resolution transform for transient frames.
[00014] In the encoder (104), the norm factors of the spectral sub-bands are quantized with a uniform logarithmic scalar quantizer with 40 steps of 3 dB. The codebook entries of the logarithmic quantizer are shown in Figure 2. As seen in the codebook, the range of the norm factors is $[2^{-2.5}, 2^{17}]$, and the value decreases as the index increases.
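A minimal sketch of such a logarithmic codebook follows. It assumes that entry i holds 2^(17 - i/2) for i = 0..39, which is consistent with 40 entries, steps of roughly 3 dB and end points of 2^17 and 2^-2.5; Figure 2 is not reproduced here, so both the exact entries and the nearest-entry rule in `quantize_norm` are assumptions made for illustration.

```python
import math

# Assumed codebook: entry i holds 2 ** (17 - i / 2), so values fall as i grows.
CODEBOOK = [2.0 ** (17 - i / 2) for i in range(40)]


def quantize_norm(norm, num_entries=40):
    """Return the index of the entry closest to `norm` in the log2 domain.

    `num_entries` restricts the search to the first entries of the codebook
    (for example 32 for the first sub-band). Nearest-entry search is an
    illustrative choice, not necessarily the rule used by the codec.
    """
    log_norm = math.log2(norm)
    return min(range(num_entries),
               key=lambda i: abs(math.log2(CODEBOOK[i]) - log_norm))


# Ratio between adjacent entries, expressed in dB (amplitude ratio).
step_db = 20 * math.log10(CODEBOOK[0] / CODEBOOK[1])
```

Under this assumption adjacent entries differ by a factor of the square root of 2, that is, about 3.01 dB.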
[00015] The coding of the quantization indices of the norm factors is illustrated in Figure 3. There are 44 sub-bands in total and, correspondingly, 44 norm factors. For the first sub-band, the norm factor is quantized using the first 32 codebook entries (301), while the other norm factors are quantized with the 40 codebook entries (302) shown in Figure 2. The quantization index of the norm factor of the first sub-band is encoded directly with 5 bits (303), while the indices of the other sub-bands are encoded differentially. The differential indices are derived as follows (304):
$$\mathrm{Diff\_index}(n) = \mathrm{Index}(n) - \mathrm{Index}(n-1) + 15, \quad n \in [1, 43]$$ (Equation 1)
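The differential step of Equation 1 can be sketched as follows. The index values are hypothetical, and the function name is illustrative only.

```python
OFFSET = 15  # constant added in Equation 1


def differential_indices(indices):
    """Apply Equation 1 to a list of quantization indices Index(0), Index(1), ...

    Returns Diff_index(1), Diff_index(2), ...; Index(0) is not included because
    it is transmitted directly with 5 bits.
    """
    return [indices[n] - indices[n - 1] + OFFSET for n in range(1, len(indices))]


indices = [10, 12, 11, 11, 14]  # hypothetical quantization indices
diffs = differential_indices(indices)
```

An unchanged index between neighboring sub-bands maps to 15, so smoothly varying envelopes produce differential indices clustered around that value.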
[00016] The differential indices are encoded by one of two methods: fixed-length coding (305) or Huffman coding (306). The Huffman table for the differential indices is shown in Figure 4. The table has 32 entries in total, from 0 to 31, which cater for the possibility of abrupt energy changes between neighboring sub-bands.
[00017] However, audio input signals are subject to a physical phenomenon called auditory masking. Auditory masking occurs when the perception of one sound is affected by the presence of another sound. For example, if two signals with similar frequencies are present at the same time, a powerful spike at 1 kHz and a lower-level tone at 1.1 kHz, the lower-level tone at 1.1 kHz will be masked (will be inaudible) because of the powerful spike at 1 kHz.
[00018] The sound pressure level needed to make a sound perceptible in the presence of another sound (the masker) is defined as the masking threshold in audio coding. The masking threshold depends on the frequency and on the sound pressure level of the masker. If two sounds have similar frequencies, the masking effect is large and the masking threshold is also large. If the masker has a high sound pressure level, it has a strong masking effect on the other sound and the masking threshold is also large.
[00019] According to the auditory masking theory above, if a sub-band has very high energy, it has a large masking effect on the other sub-bands, especially on its neighboring sub-bands. The masking threshold for the other sub-bands, especially for the neighboring sub-bands, is therefore large.
[00020] If the sound component in a neighboring sub-band has small quantization errors (below the masking threshold), listeners cannot perceive the degradation of the sound component in that sub-band.
[00021] It is not necessary to encode the norm factor of such a sub-band with very high resolution, as long as the quantization errors remain below the masking threshold.
Solution to the Problem
[00022] In the present invention, apparatus and methods that exploit the properties of the audio signal to generate Huffman tables and to select Huffman tables from a set of predefined tables during the encoding of the audio signal are provided.
[00023] In short, the properties of auditory masking are exploited to narrow the range of the differential indices, so that a Huffman table with fewer code words can be designed and used for coding. Because the Huffman table has fewer code words, codes of shorter length (consuming fewer bits) can be designed. In this way, the total number of bits consumed to encode the differential indices can be reduced.
Advantageous Effects of the Invention
[00024] By adopting Huffman codes that consume fewer bits, the total number of bits consumed to encode the differential indices can be reduced.
Brief Description of Drawings
[00025] Figure 1 illustrates the structure of ITU-T G.719;
Figure 2 shows the codebook for the quantization of norm factors;
Figure 3 illustrates the process of quantizing and coding norm factors;
Figure 4 shows the Huffman table used for coding the indices of norm factors;
Figure 5 shows the structure that adopts the present invention;
Figures 6A and 6B show examples of predefined Huffman tables;
Figure 7 illustrates the derivation of the masking curve;
Figure 8 illustrates how the range of the differential indices is narrowed;
Figure 9 shows a flow chart of how the indices are modified;
Figure 10 illustrates how the Huffman tables can be designed;
Figure 11 illustrates the structure of embodiment 2 of the present invention;
Figure 12 illustrates the structure of embodiment 3 of the present invention;
Figure 13 illustrates the encoder of embodiment 4 of the present invention;
Figure 14 illustrates the decoder of embodiment 4 of the present invention.
Description of Embodiments
[00026] The main principle of the invention is described in this section with the aid of Figures 5 to 12. Those skilled in the art will be able to modify and adapt the present invention without departing from its spirit. Illustrations are provided to facilitate explanation.
Embodiment 1
[00027] Figure 5 illustrates the codec of the invention, which comprises an encoder and a decoder that apply the inventive scheme to Huffman coding.
[00028] In the encoder shown in Figure 5, the energies of the sub-bands are processed by psychoacoustic modeling (501) to derive the masking threshold Mask(n). According to the derived Mask(n), the quantization indices of the norm factors of the sub-bands whose quantization errors are below the masking threshold are modified (502) so that the range of the differential indices can be smaller.
[00029] The differential indices for the modified indices are calculated according to the equation below (503):
$$\mathrm{Diff\_index}(n) = \mathrm{New\_Index}(n) - \mathrm{New\_Index}(n-1) + 15, \quad n \in [1, 43]$$ (Equation 2)
[00030] The range of the differential indices for Huffman coding is identified as shown in the equation below (504):
$$\mathrm{Range} = [\min(\mathrm{Diff\_index}(n)),\ \max(\mathrm{Diff\_index}(n))]$$ (Equation 3)
[00031] According to the value of the range, the Huffman table designed for that specific range is selected (505) from a set of predefined Huffman tables for encoding the differential indices (506). For example, if among all the differential indices of the input frame the minimum value is 12 and the maximum value is 18, then Range = [12, 18]. The Huffman table designed for [12, 18] is selected as the Huffman table for coding.
[00032] The set of predefined Huffman tables is designed (the details are explained later) and arranged according to the range of the differential indices. The indication signal that indicates the selected Huffman table and the coded indices are transmitted to the decoder side.
[00033] Another method for selecting the Huffman table is to calculate the total bit consumption with each Huffman table and then select the Huffman table that consumes the fewest bits.
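The bit-count-based selection just described might look like the sketch below. The two tables and their code lengths are hypothetical (they are not the tables of Figures 6A and 6B), and a table that lacks a code for some value in the frame is treated as unusable, which is an assumption of this sketch.

```python
def total_bits(diff_indices, code_lengths):
    """Bits needed to code every index with one table, or None if a value has no code."""
    if any(d not in code_lengths for d in diff_indices):
        return None
    return sum(code_lengths[d] for d in diff_indices)


def cheapest_table(diff_indices, tables):
    """Return the name of the usable table with the lowest total bit count."""
    costs = {name: total_bits(diff_indices, lengths) for name, lengths in tables.items()}
    usable = {name: bits for name, bits in costs.items() if bits is not None}
    return min(usable, key=usable.get) if usable else None


# Hypothetical code lengths per differential index value.
tables = {
    "narrow": {14: 2, 15: 1, 16: 2},
    "wide": {13: 3, 14: 3, 15: 1, 16: 3, 17: 3},
}
frame = [15, 14, 15, 16, 15]
best = cheapest_table(frame, tables)
```

For this frame the narrow table needs 7 bits and the wide table 9 bits, so the narrow table is chosen; a frame containing the value 13 can only use the wide table.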
[00034] As an example, a set of 4 predefined Huffman tables is shown in Figures 6A and 6B. In this example, the 4 predefined Huffman tables cover the ranges [13, 17], [12, 18], [11, 19] and [10, 20], respectively. Table 6.1 shows the indication signal and the corresponding range for each Huffman table. Table 6.2 shows the Huffman codes for all values in the range [13, 17]. Table 6.3 shows the Huffman codes for all values in the range [12, 18]. Table 6.4 shows the Huffman codes for all values in the range [11, 19]. Table 6.5 shows the Huffman codes for all values in the range [10, 20].
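The range-based selection can be sketched with the four ranges named above. The flag values 0 to 3 are hypothetical stand-ins for the indication signal of Table 6.1, and returning `None` when no table covers the frame is an assumption of this sketch rather than behavior stated in the text.

```python
# Ranges of the four predefined tables, keyed by a hypothetical flag value.
TABLE_RANGES = {0: (13, 17), 1: (12, 18), 2: (11, 19), 3: (10, 20)}


def select_table(diff_indices):
    """Return the flag of the narrowest table covering [min, max] of the frame."""
    low, high = min(diff_indices), max(diff_indices)
    by_width = sorted(TABLE_RANGES.items(), key=lambda item: item[1][1] - item[1][0])
    for flag, (table_low, table_high) in by_width:
        if table_low <= low and high <= table_high:
            return flag
    return None  # no predefined table covers this frame (handling is an assumption)


flag = select_table([12, 15, 18, 14])  # Range = [12, 18]
```

Picking the narrowest covering table matters because narrower tables hold fewer code words and can therefore use shorter codes.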
[00035] Comparing the Huffman code length in figures 6A and 6B with the original Huffman table shown in Figure 4, it can be seen that the Huffman code length for the same values consumes fewer bits. This explains how the bits are saved.
[00036] In the decoder shown in Figure 5, the corresponding Huffman table is selected (507) according to the indication signal for decoding the differential indices (508). The differential indices are used to reconstruct the quantization indices of the norm factors according to the equation below:
$$\mathrm{Index}(n) = \mathrm{Diff\_index}(n) + \mathrm{Index}(n-1) - 15, \quad n \in [1, 43]$$ (Equation 4)
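Inverting Equation 1 gives Index(n) = Diff_index(n) + Index(n-1) - 15, which the decoder can apply sub-band by sub-band starting from the directly transmitted first index. A minimal sketch, with hypothetical values and an illustrative function name:

```python
OFFSET = 15  # same constant as in the differential encoding


def reconstruct_indices(first_index, diff_indices):
    """Rebuild Index(0..N) from Index(0) and Diff_index(1..N)."""
    indices = [first_index]
    for diff in diff_indices:
        # Index(n) = Diff_index(n) + Index(n-1) - 15
        indices.append(diff + indices[-1] - OFFSET)
    return indices


decoded = reconstruct_indices(10, [17, 14, 15, 18])
```

Because each index depends on the previous one, an error in one decoded differential index would propagate to all later sub-bands of the frame.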
[00037] Figure 7 illustrates the derivation of the masking curve of the input signal. In principle, the energies of the sub-bands are calculated and with that, the energies and the masking curve of the input signal are derived. The derivation of the masking curve can use some existing technologies in the prior art such as the method of derivation of the masking curve in MPEG AAC codec.
[00038] Figure 8 illustrates how the range of the differential indices is narrowed. First, the masking threshold is compared with the quantization error energy of each sub-band. For sub-bands whose quantization error energy is below the masking threshold, the indices are modified to a value closer to that of the neighboring sub-band. The modification is constrained so that the corresponding quantization error energy does not exceed the masking threshold, so sound quality is not affected. After the modification, the range of the indices can be narrowed, as explained below.
[00039] As shown in Figure 8, because the quantization error energies of sub-bands 0, 2 and 4 are below the masking threshold, their indices are modified to be closer to their neighboring indices.
[00040] The indices can be modified as explained below (using sub-band 2 as an example). As shown in Figure 2, a larger index corresponds to a smaller energy, so Index(1) is smaller than Index(2). Modifying Index(2) therefore means decreasing its value. This can be done as shown in Figure 9.
[00041] For sub-bands 1 and 3, because their energies are above the masking threshold, the indices are not changed. As a result, the differential indices move closer to the center. Using sub-band 1 as an example:
$$\mathrm{Diff\_index}(1) = \mathrm{Index}(1) - \mathrm{Index}(0) + 15$$ (Equation 5)
$$\mathrm{New\_Diff\_index}(1) = \mathrm{New\_Index}(1) - \mathrm{New\_Index}(0) + 15$$ (Equation 6)
$$\mathrm{New\_Index}(1) - \mathrm{New\_Index}(0) < \mathrm{Index}(1) - \mathrm{Index}(0) \;\Rightarrow\; \mathrm{New\_Diff\_index}(1) - 15 < \mathrm{Diff\_index}(1) - 15$$ (Equation 7)
[00042] In the present invention, the Huffman table design can be done offline with a large database of input sequences. The process is illustrated in Figure 10.
[00043] The energies of the sub-bands are processed by psychoacoustic modeling (1001) to derive the masking threshold Mask(n). According to the derived Mask(n), the quantization indices of the norm factors of the sub-bands whose quantization error energies are below the masking threshold are modified (1002) so that the range of the differential indices can be smaller.
[00044] The differential indices for the modified indices are calculated (1003).
[00045] The range of the differential indices for Huffman coding is identified (1004). For each value of the range, all input signals that have the same range are gathered, and the probability distribution of each value of the differential index within the range is calculated.
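The gathering step can be sketched as below: training frames are grouped by the range of their differential indices, and a probability distribution is estimated per group. The training frames are hypothetical, and the function name is illustrative; each resulting distribution would then be fed to an ordinary Huffman construction.

```python
from collections import Counter, defaultdict


def distributions_by_range(training_frames):
    """Group frames by (min, max) of their differential indices and return,
    for each range, the relative frequency of every value seen in that group."""
    counts = defaultdict(Counter)
    for frame in training_frames:
        counts[(min(frame), max(frame))].update(frame)
    result = {}
    for value_range, counter in counts.items():
        total = sum(counter.values())
        result[value_range] = {value: n / total for value, n in sorted(counter.items())}
    return result


# Hypothetical training frames of differential indices.
frames = [[14, 15, 16, 15], [15, 14, 16, 15], [13, 15, 17, 15]]
dists = distributions_by_range(frames)
```

Values inside a range that never occur in the training data receive no entry here; a practical design would presumably still assign them a code, which this sketch leaves out.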
[00046] For each value of the range, one Huffman table is designed according to the probability distribution. Conventional Huffman table design methods can be used here to design the Huffman table.
Embodiment 2
[00047] This embodiment introduces a method that keeps the bit saving while restoring the differential indices to values closer to the original values.
[00048] As shown in Figure 11, after the Huffman table has been selected in 1105, the differential indices between the original quantization indices are calculated. The original differential indices and the new differential indices are compared to check whether they consume the same number of bits in the selected Huffman table.
[00049] If they consume the same number of bits in the selected Huffman table, the modified differential indices are restored to the original differential indices. If they do not consume the same number of bits, the code words in the Huffman table that are closest to the original differential indices and that consume the same number of bits are selected as the restored indices.
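One possible reading of this restoration rule is sketched below. The code lengths are hypothetical, each differential index is treated independently as the text describes, and breaking ties toward the smaller value is an assumption of this sketch.

```python
def restore_diff_indices(original, modified, code_lengths):
    """For each sub-band, move the modified differential index back toward the
    original one without changing the number of bits it costs in the table."""
    restored = []
    for orig, mod in zip(original, modified):
        budget = code_lengths[mod]  # bits spent on the modified index
        if code_lengths.get(orig) == budget:
            restored.append(orig)  # same cost: restore the original value
        else:
            # Closest table value with the same code length (the modified value
            # itself always qualifies, so the list is never empty).
            candidates = [v for v, bits in code_lengths.items() if bits == budget]
            restored.append(min(candidates, key=lambda v: (abs(v - orig), v)))
    return restored


code_lengths = {13: 3, 14: 3, 15: 1, 16: 3, 17: 3}  # hypothetical selected table
original = [13, 17, 15, 12]
modified = [14, 16, 15, 14]
restored = restore_diff_indices(original, modified, code_lengths)
```

In this example the first two indices return to their original values at no extra cost, while the last one, whose original value 12 lies outside the table, moves to the closest 3-bit value 13.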
[00050] The merit of this embodiment is that the quantization error of the norm factors can be smaller while the bit consumption is the same as in embodiment 1.
Embodiment 3
[00051] This embodiment introduces a method that avoids the psychoacoustic model and uses only an energy ratio threshold.
[00052] As shown in Figure 12, instead of using a psychoacoustic model to derive the masking threshold, the sub-band energies and a predefined energy ratio threshold are used to determine whether to modify the quantization index of a specific sub-band (1201). As shown in the equation below, if the energy ratios between the current sub-band and both neighboring sub-bands are less than the threshold Limit, the current sub-band is not considered important, and its quantization index can be modified.
$$\frac{\mathrm{Energy}(n)}{\mathrm{Energy}(n-1)} < \mathrm{Limit} \ \text{ and } \ \frac{\mathrm{Energy}(n)}{\mathrm{Energy}(n+1)} < \mathrm{Limit}$$ (Equation 8)
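The test of Equation 8 is simple enough to sketch directly. The energies and the threshold value are hypothetical, and the sketch assumes strictly positive neighbor energies and an interior sub-band (one that has both neighbors).

```python
def may_modify(energies, n, limit):
    """Equation 8 for an interior sub-band n: True when the energy of sub-band n
    is small relative to both neighbors, so its index may be modified."""
    return (energies[n] / energies[n - 1] < limit
            and energies[n] / energies[n + 1] < limit)


energies = [100.0, 1.0, 80.0, 90.0]  # hypothetical sub-band energies
limit = 0.1                          # hypothetical energy ratio threshold
weak = may_modify(energies, 1, limit)
strong = may_modify(energies, 2, limit)
```

Sub-band 1 is far weaker than both neighbors and passes the test, whereas sub-band 2 dominates sub-band 1 and does not.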
[00053] The modification of the quantization index can be done as shown in the equation below:
where NF_New_index(n) is the decoded norm factor for sub-band n using the modified quantization index, NF_Index(n) is the decoded norm factor for sub-band n using the original quantization index, Energy(n-1) is the energy of sub-band n-1, Energy(n) is the energy of sub-band n, and Energy(n+1) is the energy of sub-band n+1.
[00054] The merit of this embodiment is that the highly complex psychoacoustic modeling can be avoided.
Embodiment 4
[00055] This embodiment introduces a method that narrows the range of the differential indices while still allowing the differential indices to be perfectly reconstructed.
[00056] As shown in Figure 13, the differential indices are derived from the original quantization indices (1301) according to the equation below:
$$\mathrm{Diff\_index}(n) = \mathrm{Index}(n) - \mathrm{Index}(n-1) + 15$$ (Equation 10)
where Diff_index(n) is the differential index for sub-band n, Index(n) is the quantization index for sub-band n, and Index(n-1) is the quantization index for sub-band n-1.
[00057] To narrow the range of the differential indices, a module is implemented that modifies the values of some differential indices (1302).
[00058] The modification depends on the value of the differential index of the preceding sub-band and on a threshold.
[00059] One way to modify the differential index (when n > 1) is shown in the equation below; the first differential index is not modified, so that perfect reconstruction is possible on the decoder side:
$$\mathrm{Diff\_index\_new}(n) = \begin{cases} \mathrm{Diff\_index}(n) + \mathrm{Diff\_index}(n-1) - (15 + \mathrm{Limit}), & \text{if } \mathrm{Diff\_index}(n-1) > 15 + \mathrm{Limit} \\ \mathrm{Diff\_index}(n) + \mathrm{Diff\_index}(n-1) - (15 - \mathrm{Limit}), & \text{if } \mathrm{Diff\_index}(n-1) < 15 - \mathrm{Limit} \\ \mathrm{Diff\_index}(n), & \text{otherwise} \end{cases}$$ (Equation 11)
where n > 1, Diff_index(n) is the differential index for sub-band n, Diff_index(n-1) is the differential index for sub-band n-1, Diff_index_new(n) is the new differential index for sub-band n, and Limit is the threshold used to decide whether the differential index should be modified.
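The modification of Equation 11 can be sketched as follows. The list position 0 holds the first differential index, which is left untouched; the input values and the threshold of 3 are hypothetical.

```python
OFFSET = 15  # constant used in the differential indices


def modify_diff_indices(diffs, limit):
    """Apply Equation 11. diffs[0] is the first differential index (unchanged);
    every later entry is adjusted according to the ORIGINAL preceding entry."""
    new = [diffs[0]]
    for n in range(1, len(diffs)):
        prev = diffs[n - 1]  # original, unmodified preceding differential index
        if prev > OFFSET + limit:
            new.append(diffs[n] + prev - (OFFSET + limit))
        elif prev < OFFSET - limit:
            new.append(diffs[n] + prev - (OFFSET - limit))
        else:
            new.append(diffs[n])
    return new


diffs = [15, 8, 22, 14]  # hypothetical: a sharp drop followed by a sharp rise
modified = modify_diff_indices(diffs, limit=3)
```

Here the large value 22 that follows the small value 8 is pulled down to 18, so the range shrinks from [8, 22] to [8, 18].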
[00060] The reason why this modification can narrow the range of the differential indices is as follows. For audio/speech signals, the energy does fluctuate from one frequency band to another. However, it is observed that there is normally no abrupt change in energy between neighboring frequency bands. The energy gradually increases or decreases from one frequency band to the next. The norm factors, which represent the energy, also change gradually. The quantization indices of the norm factors therefore also change gradually, and the differential indices vary within a small range.
[00061] An abrupt change in energy occurs only when some main sound components with large energy start to take effect in a frequency band, or when their effect starts to diminish. The norm factors, which represent the energy, then also change abruptly from the preceding frequency band, and the quantization indices of the norm factors suddenly increase or decrease by a large amount. This results in a very large or very small differential index.
[00062] As an example, suppose there is a main sound component with large energy in frequency sub-band n, while there is no main sound component in frequency sub-bands (n-1) and (n+1). Then, according to the codebook in Figure 2, Index(n) will have a very small value, while Index(n-1) and Index(n+1) will have very large values. According to Equation (10), Diff_index(n) is then very small (less than (15 - Limit)) and Diff_index(n+1) is very large. If the modification of Equation (11) is applied, then, as shown for a generic sub-band n in Equation (12) below, the upper bound of the differential indices can be reduced, and the range of the differential indices can therefore be narrowed.
$$\begin{aligned} &\mathrm{Diff\_index}(n-1) < 15 - \mathrm{Limit} \;\Rightarrow\; \mathrm{Diff\_index}(n-1) - (15 - \mathrm{Limit}) < 0 \\ &\mathrm{Diff\_index\_new}(n) = \mathrm{Diff\_index}(n) + \mathrm{Diff\_index}(n-1) - (15 - \mathrm{Limit}) \;\Rightarrow\; \mathrm{Diff\_index\_new}(n) < \mathrm{Diff\_index}(n) \end{aligned}$$ (Equation 12)
[00063] As shown in Figure 14, on the decoder side, a module called "reconstruction of differential indices" (1403) is implemented to perfectly reconstruct the differential indices. The reconstruction depends on the value of the differential index of the preceding sub-band and on a threshold. The threshold in the decoder is the same as the threshold used in the encoder.
[00064] The way to reconstruct the differential index (when n > 1), which corresponds to the modification in the encoder, is shown in the equation below; the first differential index is received directly, since it is not modified on the encoder side:
$$\mathrm{Diff\_index}(n) = \begin{cases} \mathrm{Diff\_index\_new}(n) - \mathrm{Diff\_index}(n-1) + (15 + \mathrm{Limit}), & \text{if } \mathrm{Diff\_index}(n-1) > 15 + \mathrm{Limit} \\ \mathrm{Diff\_index\_new}(n) - \mathrm{Diff\_index}(n-1) + (15 - \mathrm{Limit}), & \text{if } \mathrm{Diff\_index}(n-1) < 15 - \mathrm{Limit} \\ \mathrm{Diff\_index\_new}(n), & \text{otherwise} \end{cases}$$ (Equation 13)
where n > 1, Diff_index(n) is the differential index for sub-band n, Diff_index(n-1) is the differential index for sub-band n-1, Diff_index_new(n) is the new differential index for sub-band n, and Limit is the threshold used to decide whether the differential index should be reconstructed.
[00065] As shown in Equation (11) and Equation (13), whether a differential index is modified, and by how much, depends entirely on the differential index of the preceding frequency band. If the differential index of the preceding frequency band can be perfectly reconstructed, then the current differential index can also be perfectly reconstructed.
[00066] As shown in Equation (11) and Equation (13), the first differential index is not modified on the encoder side; it is received directly and can be perfectly reconstructed. The second differential index can then be reconstructed from the value of the first differential index, and likewise the third differential index, the fourth differential index, and so on. Following the same procedure, all the differential indices can be perfectly reconstructed.
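The sequential reconstruction argued above can be sketched as a loop that always consults the previously reconstructed differential index. The received values and the threshold of 3 are hypothetical; the function name is illustrative.

```python
OFFSET = 15  # constant used in the differential indices


def reconstruct_diff_indices(received, limit):
    """Undo the encoder-side modification. received[0] is the first differential
    index, which arrives unmodified; each later value is corrected using the
    already reconstructed preceding differential index."""
    diffs = [received[0]]
    for n in range(1, len(received)):
        prev = diffs[n - 1]  # reconstructed in the previous iteration
        if prev > OFFSET + limit:
            diffs.append(received[n] - prev + (OFFSET + limit))
        elif prev < OFFSET - limit:
            diffs.append(received[n] - prev + (OFFSET - limit))
        else:
            diffs.append(received[n])
    return diffs


received = [15, 8, 18, 18]  # hypothetical modified differential indices
original = reconstruct_diff_indices(received, limit=3)
```

The decoder must use the reconstructed preceding value, not the received one, because the encoder made its decision on the original preceding differential index.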
[00067] The merit of this embodiment is that the range of the differential indices can be narrowed while the differential indices can still be perfectly reconstructed on the decoder side. The bit efficiency can therefore be improved while the bit-exactness of the quantization indices is retained.
[00068] Although the above embodiments have been described for cases in which the present invention is configured by hardware, the present invention can also be implemented by software in combination with hardware.
[00069] Each function block used in the description of the above embodiments can typically be implemented as an LSI, which is an integrated circuit. The function blocks can be individual chips, or some or all of them can be integrated on a single chip. The term "LSI" is used here, but the terms "IC", "system LSI", "super LSI" or "ultra LSI" may also be used, depending on the degree of integration.
[00070] In addition, the method of circuit integration is not limited to LSI; implementation using dedicated circuitry or general-purpose processors is also possible. After LSI manufacture, the use of an FPGA (Field Programmable Gate Array) or of a reconfigurable processor, in which the connections and settings of the circuit cells within the LSI can be reconfigured, is also possible.
[00071] In addition, if an integrated circuit technology emerges that replaces LSI as a result of advances in semiconductor technology or another derivative technology, the function blocks can naturally also be integrated using that technology. Application of biotechnology is also possible.
[00072] The disclosures of Japanese Patent Application No. 201194295, filed on April 20, 2011, and Japanese Patent Application No. 2011-133432, filed on June 15, 2011, including the specifications, drawings and abstracts, are incorporated herein by reference in their entirety.
Industrial Applicability
[00073] The encoding apparatus, decoding apparatus and the encoding and decoding methods according to the present invention are applicable to a wireless terminal device, a base station device in a mobile communication system, a teleconferencing terminal device, a video conference terminal device and a voice over internet protocol (VoIP) terminal device.
Reference Signs List
101 Transient detector; 102 Transform; 103 Norm estimation; 104 Norm quantization and coding; 105 Spectrum normalization; 106 Norm adjustment; 107 Bit allocation; 108 Lattice quantization and coding; 109 Noise level adjustment; 110 Multiplexing; 111 Demultiplexing; 112 Lattice decoding; 113 Spectral fill generator; 114 Envelope shaping; 115 Inverse transform
301 Scalar quantization (32 steps); 302 Scalar quantization (40 steps); 303 Direct transmission (5 bits); 304 Difference; 305 Fixed-length coding; 306 Huffman coding
501 Psychoacoustic model; 502 Index modification; 503 Difference; 504 Range check; 505 Huffman code table selection; 506 Huffman coding; 507 Huffman code table selection; 508 Huffman decoding; 509 Sum
1001 Psychoacoustic model; 1002 Index modification; 1003 Difference; 1004 Range check; 1005 Probability; 1006 Derivation of Huffman codes
1101 Psychoacoustic model; 1102 Index modification; 1103 Difference; 1104 Range check; 1105 Selection of the Huffman code table; 1106 Difference; 1107 Restoration of differential indices; 1108 Huffman coding
1201 1202 1203 1204 1205 Difference; index checking; range check; selection of the Huffman code table
1301 Difference; 1302 Modification of differential indices; 1303 Range check; 1304 Selection of the Huffman code table; 1305 Huffman coding
1401 Selection of the Huffman code table; 1402 Huffman decoding; 1403 Reconstruction of the differential indices; 1404 Sum

Claims:
Claims (6)
[0001]
1. Audio/speech encoding apparatus, characterized by comprising: a processor; a memory; a transformation section adapted to transform a time-domain input signal into a frequency spectrum; a band-splitting section adapted to divide the frequency spectrum into a plurality of bands; a norm factor computation section adapted to calculate the level of norm factors for each band; a quantization section adapted to quantize the norm factors for each band; a differential index computation section (1301) adapted to calculate differential indices between an Nth band index and an (N-1)th band index, where N is an integer of 1 or more; a differential index modification section (1302) adapted to modify a range of the differential indices for the Nth band when N is an integer of 2 or more and to replace the differential index with the modified differential index, and adapted not to modify the range of the differential indices for the Nth band when N is 1; a Huffman coding section (1304, 1305) adapted to encode the differential indices using a Huffman table selected from a number of predefined Huffman tables; and a transmission section adapted to transmit the encoded differential indices and an indication signal indicating the selected Huffman table to an audio/speech decoding apparatus, wherein, when the calculated differential index of the (N-1)th band is greater than a first value, the differential index modification section is adapted to modify the differential index for the Nth band by adding a subtracted value determined by subtracting the first value from the differential index for the (N-1)th band, wherein, when the calculated differential index of the (N-1)th band is less than a second value, the differential index modification section is adapted to modify the differential index for the Nth band by adding a subtracted value determined by subtracting the second value from the differential index for the (N-1)th band, wherein the first value is the sum of an offset value and a threshold value, the second value is the difference between the offset value and the threshold value, and the offset value is 15, and wherein, when the calculated differential index of the (N-1)th band is not greater than the first value and not less than the second value, the differential index modification section is adapted not to modify the differential index for the Nth band.
2. Audio / speech decoding apparatus, characterized by the fact that it comprises: a reception section for receiving encoded audio / speech signals transmitted from an audio / speech coding apparatus; a processor; a memory; a Huffman table selection section (1401) adapted to select a Huffman table according to an indication signal indicating the Huffman table selected by the audio / speech coding apparatus; a Huffman decoding section (1402) adapted to decode, using the selected Huffman table, differential indices between an Nth band index and an (N-1)th band index, where N is an integer of 1 or more, received from the audio / speech coding apparatus; a differential index reconstruction section (1403) adapted to reconstruct the Nth differential index decoded using the selected Huffman table when N is an integer of 2 or more and to replace the differential index with the reconstructed differential index, and adapted not to reconstruct the Nth differential index when N is 1; an index computation section (1404) adapted to calculate quantization indices using the reconstructed differential indices; a dequantization section adapted to dequantize a level of norm factors for each band; and a transformation section adapted to transform a decoded spectrum, generated in the frequency domain using the norm factors for each band, into a time-domain signal, wherein, when the decoded differential index of the (N-1)th band is greater than a first value, the differential index reconstruction section is adapted to reconstruct the differential index for the Nth band by subtracting a subtracted value determined by subtracting the first value from the differential index for the (N-1)th band, wherein, when the decoded differential index of the (N-1)th band is lower than a second value, the differential index reconstruction section is adapted to reconstruct the differential index for the Nth band by subtracting a subtracted value determined by subtracting the second value from the differential index for the (N-1)th band, wherein the first value is a sum of a deviation value and a limit value, the second value is a difference between the deviation value and the limit value, and the deviation value is 15, and wherein, when the decoded differential index of the (N-1)th band is not greater than the first value and is not less than the second value, the differential index reconstruction section is adapted to reconstruct a decoded differential index for the Nth band.
3. Audio / speech coding method, characterized by the fact that it comprises the steps of: transforming, through a transformation section, a time-domain input signal into a frequency spectrum; dividing the frequency spectrum into a plurality of bands; calculating a level of norm factors for each band; quantizing the norm factors for each band; calculating differential indices between an Nth band index and an (N-1)th band index, where N is an integer of 1 or more; modifying a range of the differential indices for the Nth band when N is an integer of 2 or more and replacing the differential index with the modified differential index, and not modifying the range of the differential indices for the Nth band when N is 1; encoding the differential indices using a Huffman table selected from a number of predefined Huffman tables; and transmitting the coded differential indices, and an indication signal indicating the selected Huffman table, to an audio / speech decoding apparatus, wherein, when the calculated differential index of the (N-1)th band is greater than a first value, the differential index for the Nth band is modified by adding a subtracted value determined by subtracting the first value from the differential index for the (N-1)th band, wherein, when the calculated differential index of the (N-1)th band is less than a second value, the differential index for the Nth band is modified by adding a subtracted value determined by subtracting the second value from the differential index for the (N-1)th band, wherein the first value is a sum of a deviation value and a limit value, the second value is a difference between the deviation value and the limit value, and the deviation value is 15, and wherein, when the calculated differential index of the (N-1)th band is not greater than the first value and is not less than the second value, the differential index for the Nth band is not modified.
4. Audio / speech decoding method, characterized by the fact that it comprises the steps of: receiving encoded audio / speech signals transmitted from an audio / speech coding apparatus; selecting a Huffman table according to an indication signal indicating the Huffman table selected by the audio / speech coding apparatus; decoding, using the selected Huffman table, differential indices between an Nth band index and an (N-1)th band index, where N is an integer of 1 or more, received from the audio / speech coding apparatus; reconstructing the Nth differential index decoded using the selected Huffman table when N is an integer of 2 or more and replacing the differential index with the reconstructed differential index, and not reconstructing the Nth differential index when N is 1; calculating quantization indices using the reconstructed differential indices; dequantizing, by a dequantization section, a level of norm factors for each band; and transforming a decoded spectrum, generated in the frequency domain using the norm factors for each band, into a time-domain signal, wherein, when the decoded differential index of the (N-1)th band is greater than a first value, the differential index for the Nth band is reconstructed by subtracting a subtracted value determined by subtracting the first value from the differential index for the (N-1)th band, wherein, when the decoded differential index of the (N-1)th band is less than a second value, the differential index for the Nth band is reconstructed by subtracting a subtracted value determined by subtracting the second value from the differential index for the (N-1)th band, wherein the first value is a sum of a deviation value and a limit value, the second value is a difference between the deviation value and the limit value, and the deviation value is 15, and wherein, when the decoded differential index of the (N-1)th band is not greater than the first value and is not less than the second value, a decoded differential index for the Nth band is reconstructed.
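The range modification recited in claims 1 and 3, and its inverse recited in claims 2 and 4, can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names `modify` and `reconstruct` and the parameter `limit` are hypothetical, and the claims do not fix the limit value. The sketch also assumes that the encoder tests the unmodified index of band N-1 while the decoder tests the already reconstructed index of band N-1, which is what makes the round trip exact.

```python
DEVIATION = 15  # deviation value fixed at 15 by the claims


def modify(diff, limit):
    """Encoder side: narrow the range of the differential indices.

    diff[0] corresponds to N = 1 and is never modified.
    `limit` is a hypothetical parameter; the claims do not give its value.
    """
    first, second = DEVIATION + limit, DEVIATION - limit
    out = list(diff[:1])
    for n in range(1, len(diff)):
        prev = diff[n - 1]  # calculated (unmodified) index of band N-1
        if prev > first:
            out.append(diff[n] + (prev - first))
        elif prev < second:
            out.append(diff[n] + (prev - second))
        else:
            out.append(diff[n])
    return out


def reconstruct(modified, limit):
    """Decoder side: undo modify(), band by band."""
    first, second = DEVIATION + limit, DEVIATION - limit
    out = list(modified[:1])
    for n in range(1, len(modified)):
        prev = out[n - 1]  # assumption: reconstructed index of band N-1
        if prev > first:
            out.append(modified[n] - (prev - first))
        elif prev < second:
            out.append(modified[n] - (prev - second))
        else:
            out.append(modified[n])
    return out


original = [15, 25, 3, 16, 14]
narrowed = modify(original, limit=3)      # [15, 25, 10, 7, 14]
restored = reconstruct(narrowed, limit=3)  # equals `original`
```

In this example the indices originally span 3 to 25, and after modification they span 7 to 25, so a Huffman table designed for a smaller range becomes usable.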
5. Audio / speech coding apparatus, according to claim 1, characterized by the fact that it comprises: a Huffman table selection section adapted to select the Huffman table that was designed based on a minimum value and a maximum value of the differential indices.
6. Audio / speech coding apparatus, according to claim 1, characterized by the fact that it comprises: a Huffman table selection section adapted to select the Huffman table that consumes the fewest bits to encode the differential indices.
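The selection rule of claim 6 can be sketched in Python as follows. This is an illustration only. The function `select_table`, the representation of each table as a mapping from differential index to code length in bits, and the two example tables are all hypothetical and are not taken from the patent.

```python
def select_table(diff_indices, tables):
    """Return (table_id, total_bits) for the cheapest usable table.

    `tables` is a list of dicts mapping a differential index to its
    Huffman code length in bits (hypothetical representation).
    A table is skipped if it has no code for some index.
    Returns None when no table can encode all indices.
    """
    best = None
    for table_id, code_lengths in enumerate(tables):
        if any(d not in code_lengths for d in diff_indices):
            continue  # this table's range does not cover the input
        bits = sum(code_lengths[d] for d in diff_indices)
        if best is None or bits < best[1]:
            best = (table_id, bits)
    return best


# Hypothetical code lengths; both sets satisfy the Kraft equality.
narrow_table = {14: 2, 15: 1, 16: 2}
wide_table = {13: 3, 14: 3, 15: 2, 16: 2, 17: 2}
tables = [narrow_table, wide_table]

choice_a = select_table([15, 16, 15, 14], tables)  # (0, 6)
choice_b = select_table([15, 17, 13], tables)      # (1, 7)
```

The returned table identifier plays the role of the indication signal that the transmission section sends to the decoder so that the same table can be selected there.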

Similar technologies:

Publication number | Publication date | Patent title

BR112013026850B1|2021-02-23|AUDIO / SPEECH ENCODING AND DECODING APPLIANCES AND AUDIO / SPEECH DECODING AND DECODING METHODS

ES2665766T3|2018-04-27|Mixing of input data streams and generation from there of an output data stream

ES2664090T3|2018-04-18|Filling of subcodes not encoded in audio signals encoded by transform

EP2856776B1|2019-03-27|Stereo audio signal encoder

JP2018205766A|2018-12-27|Method, encoder, decoder, and mobile equipment

AU2015295604B2|2018-09-20|Encoder, decoder, system and methods for encoding and decoding

Wang et al.|2013|Context-based adaptive arithmetic coding in time and frequency domain for the lossless compression of audio coding parameters at variable rate

ES2737889T3|2020-01-16|Encoder, decoder, encoding procedure, decoding procedure and program

BR112015025009B1|2021-12-21|QUANTIZATION AND REVERSE QUANTIZATION UNITS, ENCODER AND DECODER, METHODS FOR QUANTIZING AND DEQUANTIZING

BRPI0317954B1|2017-01-03|Variable rate audio coding and decoding process

Patent family:

Publication number | Publication date

ZA201307316B|2014-12-23|

EP3096315B1|2019-10-16|

JP2016170428A|2016-09-23|

US20190122682A1|2019-04-25|

US20180166086A1|2018-06-14|

KR20180055917A|2018-05-25|

TWI573132B|2017-03-01|

US9881625B2|2018-01-30|

EP3594943A1|2020-01-15|

EP2701144B1|2016-07-27|

JP5937064B2|2016-06-22|

CN103415884A|2013-11-27|

KR101959698B1|2019-03-20|

CA3051552C|2021-09-21|

RU2013146688A|2015-05-27|

US10515648B2|2019-12-24|

KR20140022813A|2014-02-25|

PL3096315T3|2020-04-30|

TW201717194A|2017-05-16|

ES2765527T3|2020-06-09|

EP2701144A4|2014-03-26|

JPWO2012144127A1|2014-07-28|

RU2585990C2|2016-06-10|

JP2018112759A|2018-07-19|

EP3096315A3|2017-02-15|

CA2832032C|2019-09-24|

US20140114651A1|2014-04-24|

US10204632B2|2019-02-12|

CN103415884B|2015-06-10|

WO2012144127A1|2012-10-26|

CA3051552A1|2012-10-26|

JP6518361B2|2019-05-22|

TWI598872B|2017-09-11|

KR101859246B1|2018-05-17|

JP6321072B2|2018-05-09|

KR20190028569A|2019-03-18|

EP3096315A2|2016-11-23|

TW201246187A|2012-11-16|

EP2701144A1|2014-02-26|

MY164987A|2018-02-28|

CN104485111A|2015-04-01|

KR101995694B1|2019-07-02|

CA2832032A1|2012-10-26|

CN104485111B|2018-08-24|

Cited documents:

Publication number | Filing date | Publication date | Applicant | Patent title

JP3131542B2|1993-11-25|2001-02-05|シャープ株式会社|Encoding / decoding device|

JP3186007B2|1994-03-17|2001-07-11|日本電信電話株式会社|Transform coding method, decoding method|

US5956674A|1995-12-01|1999-09-21|Digital Theater Systems, Inc.|Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels|

US5848195A|1995-12-06|1998-12-08|Intel Corporation|Selection of huffman tables for signal encoding|

US6366614B1|1996-10-11|2002-04-02|Qualcomm Inc.|Adaptive rate control for digital video compression|

JP3784993B2|1998-06-26|2006-06-14|株式会社リコー|Acoustic signal encoding / quantization method|

KR100844810B1|2000-12-22|2008-07-09|소니 가부시끼 가이샤|Encoder and decoder|

US6411226B1|2001-01-16|2002-06-25|Motorola, Inc.|Huffman decoder with reduced memory size|

JP2002268693A|2001-03-12|2002-09-20|Mitsubishi Electric Corp|Audio encoding device|

US20040120404A1|2002-11-27|2004-06-24|Takayuki Sugahara|Variable length data encoding method, variable length data encoding apparatus, variable length encoded data decoding method, and variable length encoded data decoding apparatus|

JP2003233397A|2002-02-12|2003-08-22|Victor Co Of Japan Ltd|Device, program, and data transmission device for audio encoding|

EP1734511B1|2002-09-04|2009-11-18|Microsoft Corporation|Entropy coding by adapting coding between level and run-length/level modes|

JP4369140B2|2003-02-17|2009-11-18|パナソニック株式会社|Audio high-efficiency encoding apparatus, audio high-efficiency encoding method, audio high-efficiency encoding program, and recording medium therefor|

JP4212591B2|2003-06-30|2009-01-21|富士通株式会社|Audio encoding device|

EP1513137A1|2003-08-22|2005-03-09|MicronasNIT LCC, Novi Sad Institute of Information Technologies|Speech processing system and method with multi-pulse excitation|

US7966424B2|2004-03-15|2011-06-21|Microsoft Corporation|Data compression|

US7668715B1|2004-11-30|2010-02-23|Cirrus Logic, Inc.|Methods for selecting an initial quantization step size in audio encoders and systems using the same|

MX2008010836A|2006-02-24|2008-11-26|France Telecom|Method for binary coding of quantization indices of a signal envelope, method for decoding a signal envelope and corresponding coding and decoding modules.|

JP5010197B2|2006-07-26|2012-08-29|株式会社東芝|Speech encoding device|

EP2054879B1|2006-08-15|2010-01-20|Broadcom Corporation|Re-phasing of decoder states after packet loss|

JP4823001B2|2006-09-27|2011-11-24|富士通セミコンダクター株式会社|Audio encoding device|

EP2054875B1|2006-10-16|2011-03-23|Dolby Sweden AB|Enhanced coding and parameter representation of multichannel downmixed object coding|

US7966175B2|2006-10-18|2011-06-21|Polycom, Inc.|Fast lattice vector quantization|

US7953595B2|2006-10-18|2011-05-31|Polycom, Inc.|Dual-transform coding of audio signals|

RU2406165C2|2007-02-14|2010-12-10|ЭлДжи ЭЛЕКТРОНИКС ИНК.|Methods and devices for coding and decoding object-based audio signals|

WO2009004727A1|2007-07-04|2009-01-08|Fujitsu Limited|Encoding apparatus, encoding method and encoding program|

KR101426788B1|2007-11-20|2014-08-06|삼성전자주식회사|Apparatus and method for reporting channel quality indicator in wireless communication system|

US8630848B2|2008-05-30|2014-01-14|Digital Rise Technology Co., Ltd.|Audio signal transient detection|

US8463603B2|2008-09-06|2013-06-11|Huawei Technologies Co., Ltd.|Spectral envelope coding of energy attack signal|

US8194862B2|2009-07-31|2012-06-05|Activevideo Networks, Inc.|Video game system with mixing of independent pre-encoded digital audio bitstreams|

JP5358818B2|2009-10-27|2013-12-04|株式会社ユーシン|Locking and unlocking device for doors|

JP2011133432A|2009-12-25|2011-07-07|Shizuoka Oil Service:Kk|Oil viscosity checker and oil supply system using the same|

US9106925B2|2010-01-11|2015-08-11|Ubiquity Holdings, Inc.|WEAV video compression system|

CN102222505B|2010-04-13|2012-12-19|中兴通讯股份有限公司|Hierarchical audio coding and decoding methods and systems and transient signal hierarchical coding and decoding methods|

CA2832032C|2011-04-20|2019-09-24|Panasonic Corporation|Device and method for execution of huffman coding|

KR100715450B1|2004-02-02|2007-05-07|경안인더스트리|Asbestos-free insulation board and manufacturing method thereof|

CA2832032C|2011-04-20|2019-09-24|Panasonic Corporation|Device and method for execution of huffman coding|

MX341885B|2012-12-13|2016-09-07|Panasonic Ip Corp America|Voice audio encoding device, voice audio decoding device, voice audio encoding method, and voice audio decoding method.|

KR102192245B1|2013-05-24|2020-12-17|돌비 인터네셔널 에이비|Audio encoder and decoder|

KR102270106B1|2013-09-13|2021-06-28|삼성전자주식회사|Energy lossless-encoding method and apparatus, signal encoding method and apparatus, energy lossless-decoding method and apparatus, and signal decoding method and apparatus|

CN105723454B|2013-09-13|2020-01-24|三星电子株式会社|Energy lossless encoding method and apparatus, signal encoding method and apparatus, energy lossless decoding method and apparatus, and signal decoding method and apparatus|

US20150142345A1|2013-10-18|2015-05-21|Alpha Technologies Inc.|Status Monitoring Systems and Methods for Uninterruptible Power Supplies|

WO2015130509A1|2014-02-28|2015-09-03|Dolby Laboratories Licensing Corporation|Perceptual continuity using change blindness in conferencing|

US10553228B2|2015-04-07|2020-02-04|Dolby International Ab|Audio coding with range extension|

MX2018002967A|2015-09-13|2018-06-11|Alpha Tech Inc|Power control systems and methods.|

US10381867B1|2015-10-16|2019-08-13|Alpha Technologies Services, Inc.|Ferroresonant transformer systems and methods with selectable input and output voltages for use in uninterruptible power supplies|

CN110140330B|2017-01-02|2021-08-13|杜塞尔多夫华为技术有限公司|Apparatus and method for shaping probability distribution of data sequence|

KR20190122709A|2017-03-14|2019-10-30|소니 주식회사|Recording device, recording method, playback device, playback method and recording and playback device|

US20180288439A1|2017-03-31|2018-10-04|Mediatek Inc.|Multiple Transform Prediction|

US10635122B2|2017-07-14|2020-04-28|Alpha Technologies Services, Inc.|Voltage regulated AC power supply systems and methods|

CN109286922B|2018-09-27|2021-09-17|珠海市杰理科技股份有限公司|Bluetooth prompt tone processing method, system, readable storage medium and Bluetooth device|

Legal status:
2018-01-16| B25A| Requested transfer of rights approved|Owner name: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AME |

2018-12-18| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|

2020-09-01| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|

2021-01-05| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|

2021-02-23| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 12/03/2012, SUBJECT TO THE LEGAL CONDITIONS. |

Priority:

Application number | Filing date | Patent title

JP2011-094295|2011-04-20|

JP2011094295|2011-04-20|

JP2011133432|2011-06-15|

JP2011-133432|2011-06-15|

PCT/JP2012/001701|WO2012144127A1|2011-04-20|2012-03-12|Device and method for execution of huffman coding|
