Brazilian patent BR112012021359B1 HIERARCHICAL AUDIO CODING METHOD, HIERARCHICAL AUDIO DECODING METHOD, HIERARCHICAL AUDIO CODING METH

Abstract:
Hierarchical audio encoding method, hierarchical audio decoding method, hierarchical audio encoding method for transient signals, hierarchical decoding method for transient signals, and hierarchical audio encoding system. A hierarchical audio encoding and decoding method and system are provided, together with a hierarchical audio encoding and decoding method for transient signals. The hierarchical audio encoding method comprises: performing transient detection on the audio signal of a current frame (10); performing a time-frequency transform to obtain the total frequency domain coefficients of the current frame (20); quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands, to obtain their amplitude envelope quantization indices and encoded bits (30); quantizing and encoding the core layer frequency domain coefficients to obtain their encoded bits (40); inverse-quantizing the vector-quantized core layer frequency domain coefficients and taking the difference from the original frequency domain coefficients to obtain the core layer residual signals (50), and calculating the amplitude envelope quantization indices of those residual signals (60); quantizing and encoding the extended layer coding signals to obtain their encoded bits (70); and multiplexing and packing the encoded amplitude envelope bits of the core layer and extended layer coding subbands, the encoded bits of the core layer frequency domain coefficients and the encoded bits of the extended layer coding signals, then transmitting them to the decoding terminal (80).
Publication number: BR112012021359B1
Application number: R112012021359-8
Filing date: 2011-01-12
Publication date: 2020-12-15
Inventors: Ke Peng; Guoming Chen; Hao Yuan; Dongping Jiang; Jiali Li
Applicant: ZTE Corporation
IPC main classification:

Description:

TECHNICAL FIELD
[001] The present invention relates to audio encoding and decoding technology, and in particular to a method and system for hierarchical audio encoding and decoding, and to a hierarchical encoding and decoding method for transient signals.
BACKGROUND OF THE RELATED ART
[002] Hierarchical audio coding organizes the bit streams produced by audio coding hierarchically, generally dividing them into a core layer and several extended layers. A decoder can decode only the encoded bit stream of a lower layer (such as the core layer) when no encoded bit stream from an upper layer (such as an extended layer) is available, and the more layers are decoded, the better the audio quality.
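The layered decoding behaviour described above can be illustrated with a minimal Python sketch. It assumes that each extended layer refines the layers below it, so decoding stops at the first missing layer; this stopping rule and the function name `decodable_layers` are assumptions made for illustration and are not stated in the source.

```python
def decodable_layers(received):
    """Return the indices of the layers a decoder could use.

    received[i] is True when layer i arrived intact; layer 0 is the core layer.
    Assumption (not from the source): a layer is only useful when every layer
    below it is also available, so decoding stops at the first gap.
    """
    usable = []
    for index, present in enumerate(received):
        if not present:
            break
        usable.append(index)
    return usable
```

With `received = [True, True, False, True]` the decoder in this sketch would use the core layer and the first extended layer only; with the core layer missing, nothing is decodable.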
[003] Hierarchical coding has considerable practical value for communications networks. On the one hand, data transfer may be completed by several cooperating channels whose packet loss rates differ. In that case it is usually necessary to process the data hierarchically: the important parts of the data are transmitted on stable channels with relatively low packet loss rates, and the secondary parts are transmitted on unstable channels with relatively high packet loss rates. This guarantees that packet loss on the unstable channels causes only a relative reduction in audio quality, rather than making a data frame completely undecodable. On the other hand, the bandwidth of some communication networks (such as the Internet) is very unstable, and the bandwidths of different user terminals vary. A single fixed bit rate cannot satisfy users with different bandwidths, whereas a hierarchical encoding scheme allows each user to obtain the best sound quality available under their own bandwidth conditions.
[004] Traditional hierarchical audio coding schemes, such as G.729.1 and G.VBR of the International Telecommunication Union (ITU), do not apply any processing directed at transient signal frames. Therefore, for signals with substantial transient components (such as a percussion signal), encoding efficiency is reduced, especially at moderate and low bit rates.
SUMMARY OF THE INVENTION
[005] The technical problem to be solved by the present invention is to provide an efficient hierarchical audio encoding and decoding method and system, and a hierarchical encoding and decoding method for transient signals, in order to improve the quality of hierarchical encoding and decoding of audio.
[006] In order to solve the problem mentioned above, the present invention provides a hierarchical audio encoding method comprising:
[007] performing transient detection on the audio signal of a current frame;
[008] when the transient detection determines that the signal is a steady-state signal, performing a time-frequency transform on the audio signal to obtain total frequency domain coefficients; when the transient detection determines that the signal is a transient signal, dividing the audio signal into M subframes, performing the time-frequency transform on each subframe, the M groups of frequency domain coefficients obtained by the transform constituting the total frequency domain coefficients of the current frame, and reorganizing the total frequency domain coefficients so that their corresponding coding subbands are aligned from low frequencies to high frequencies; wherein the total frequency domain coefficients comprise core layer frequency domain coefficients and extended layer frequency domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extended layer frequency domain coefficients constitute several extended layer coding subbands;
[009] quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands, to obtain the amplitude envelope quantization indices and the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands; wherein, if the signal is a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands are quantized jointly, and if the signal is a transient signal, the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extended layer coding subbands are each reorganized;
[0010] performing bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, and then quantizing and encoding the core layer frequency domain coefficients to obtain encoded bits of the core layer frequency domain coefficients;
[0011] inverse-quantizing the core layer frequency domain coefficients that have undergone vector quantization, and taking the difference from the original frequency domain coefficients obtained by the time-frequency transform, to obtain core layer residual signals;
[0012] calculating the amplitude envelope quantization indices of the core layer residual signals according to the bit allocation numbers and the amplitude envelope quantization indices of the core layer coding subbands;
[0013] performing bit allocation on the coding subbands of the extended layer coding signals according to the amplitude envelope quantization indices of the core layer residual signals and the amplitude envelope quantization indices of the extended layer coding subbands, and then quantizing and encoding the extended layer coding signals to obtain encoded bits of the extended layer coding signals, wherein the extended layer coding signals consist of the core layer residual signals and the extended layer frequency domain coefficients; and
[0014] multiplexing and packing the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands, the encoded bits of the core layer frequency domain coefficients and the encoded bits of the extended layer coding signals, and then transmitting them to a decoding terminal.
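The reorganization step for transient frames can be sketched in Python as follows. The source states only that the M groups of coefficients are reorganized so that the coding subbands run from low to high frequencies; the sketch assumes a simple interleaving by frequency index, and the function name `interleave_subframes` is hypothetical. The time-frequency transform itself is omitted.

```python
def interleave_subframes(groups):
    """Interleave M equal-length coefficient groups by frequency index.

    groups[m][k] is coefficient k of subframe m. The result places all
    coefficients with the same frequency index k next to each other, so
    frequency increases monotonically along the output.
    Assumption: this interleaving is one possible reading of the
    reorganization; the source does not give the exact ordering.
    """
    if not groups:
        return []
    length = len(groups[0])
    if any(len(group) != length for group in groups):
        raise ValueError("all subframe groups must have the same length")
    return [groups[m][k] for k in range(length) for m in range(len(groups))]
```

For two subframes with coefficients `[a0, a1, a2]` and `[b0, b1, b2]`, the sketch yields `[a0, b0, a1, b1, a2, b2]`, after which the same subband partitioning used for steady-state frames could be applied.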
[0015] In order to solve the problem mentioned above, the present invention further provides a method of hierarchical audio decoding comprising:
[0016] demultiplexing a bit stream transmitted by an encoding terminal, and decoding the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands, to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extended layer coding subbands; if the transient detection information indicates a transient signal, further reorganizing the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands, respectively, so that their corresponding frequencies are aligned from low to high within the respective layers;
[0017] performing bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, thereby calculating the amplitude envelope quantization indices of the core layer residual signals, and performing bit allocation on the coding subbands of the extended layer coding signals according to the amplitude envelope quantization indices of the core layer residual signals and the amplitude envelope quantization indices of the extended layer coding subbands;
[0018] decoding the encoded bits of the core layer frequency domain coefficients and the encoded bits of the extended layer coding signals, respectively, according to the bit allocation numbers of the core layer coding subbands and of the coding subbands of the extended layer coding signals, to obtain the core layer frequency domain coefficients and the extended layer coding signals, rearranging the extended layer coding signals in subband order and adding them to the core layer frequency domain coefficients, to obtain full-bandwidth frequency domain coefficients; and
[0019] if the transient detection information indicates a steady-state signal, performing an inverse time-frequency transform directly on the full-bandwidth frequency domain coefficients to obtain an audio signal for output; and if the transient detection information indicates a transient signal, reorganizing the full-bandwidth frequency domain coefficients, dividing them into M groups of frequency domain coefficients, performing the inverse time-frequency transform on each group, and calculating the final audio signal from the M groups of time domain signals obtained by the transform.
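The recombination of the two layers at the decoder can be sketched as below. The sketch assumes the extended layer coding signals are already in subband order, with the core layer residuals first and the extended layer coefficients after them; the function name `full_band_coefficients` and this layout are illustrative assumptions.

```python
def full_band_coefficients(core_coeffs, ext_signals):
    """Add decoded core coefficients to the extended layer coding signals.

    ext_signals is assumed to hold the core layer residuals (same length as
    core_coeffs) followed by the extended layer frequency domain coefficients.
    """
    if len(ext_signals) < len(core_coeffs):
        raise ValueError("extended layer signals must cover the core band")
    full = list(ext_signals)
    for k, coeff in enumerate(core_coeffs):
        # In the core band, decoded coefficient plus residual restores the value.
        full[k] += coeff
    return full
```

In the core band the sum refines the core layer reconstruction; above the core band the extended layer coefficients pass through unchanged.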
[0020] In order to solve the problem mentioned above, the present invention also provides a hierarchical audio coding method for transient signals, comprising:
[0021] dividing an audio signal into M subframes, performing a time-frequency transform on each subframe, the M groups of frequency domain coefficients obtained by the transform constituting the total frequency domain coefficients of a current frame, and reorganizing the total frequency domain coefficients so that their corresponding coding subbands are aligned from low frequencies to high frequencies; wherein the total frequency domain coefficients comprise core layer frequency domain coefficients and extended layer frequency domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extended layer frequency domain coefficients constitute several extended layer coding subbands;
[0022] quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands, to obtain the amplitude envelope quantization indices and encoded bits of the core layer coding subbands and the extended layer coding subbands; wherein the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extended layer coding subbands are each reorganized;
[0023] performing bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, and then quantizing and encoding the core layer frequency domain coefficients to obtain encoded bits of the core layer frequency domain coefficients;
[0024] inverse-quantizing the core layer frequency domain coefficients that have undergone vector quantization, and taking the difference from the original frequency domain coefficients obtained by the time-frequency transform, to obtain core layer residual signals;
[0025] calculating the amplitude envelope quantization indices of the coding subbands of the core layer residual signals according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the core layer coding subbands;
[0026] performing bit allocation on the coding subbands of the extended layer coding signals according to the amplitude envelope quantization indices of the core layer residual signals and the amplitude envelope quantization indices of the extended layer coding subbands, and then quantizing and encoding the extended layer coding signals to obtain encoded bits of the extended layer coding signals, wherein the extended layer coding signals consist of the core layer residual signals and the extended layer frequency domain coefficients; and
[0027] multiplexing and packing the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands, the encoded bits of the core layer frequency domain coefficients and the encoded bits of the extended layer coding signals, and then transmitting them to a decoding terminal.
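The formation of the extended layer coding signals from the core layer residual can be sketched as follows. A uniform scalar quantizer stands in for the vector quantization named in the source, purely to keep the example short; the names `quantize`, `dequantize`, `extended_layer_signals` and the `step` parameter are hypothetical.

```python
def quantize(values, step):
    """Uniform scalar quantizer (a stand-in for the vector quantizer)."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map indices back to reconstruction levels."""
    return [i * step for i in indices]


def extended_layer_signals(all_coeffs, core_len, step):
    """Build the extended layer coding signals for one frame.

    The first core_len coefficients belong to the core layer. Their residual
    (original minus dequantized value) is followed by the untouched extended
    layer coefficients.
    """
    core = all_coeffs[:core_len]
    reconstructed = dequantize(quantize(core, step), step)
    residual = [orig - rec for orig, rec in zip(core, reconstructed)]
    return residual + list(all_coeffs[core_len:])
```

The returned list is what the extended layer would then quantize and encode, so that a decoder receiving both layers can correct the core layer quantization error.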
[0028] In order to solve the problem mentioned above, the present invention also provides a hierarchical decoding method for transient signals, comprising:
[0029] demultiplexing a bit stream transmitted by an encoding terminal, decoding the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extended layer coding subbands, and reorganizing the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands, respectively, so that their corresponding frequencies are aligned from low to high within the respective layers;
[0030] performing bit allocation on the core layer coding subbands according to the reorganized amplitude envelope quantization indices of the core layer coding subbands, and thereby calculating the amplitude envelope quantization indices of the core layer residual signals;
[0031] performing bit allocation on the coding subbands of the extended layer according to the amplitude envelope quantization indices of the core layer residual signals and the reorganized amplitude envelope quantization indices of the extended layer coding subbands;
[0032] decoding the encoded bits of the core layer frequency domain coefficients and the encoded bits of the extended layer coding signals, respectively, according to the bit allocation numbers of the core layer coding subbands and of the coding subbands of the extended layer coding signals, to obtain the core layer frequency domain coefficients and the extended layer coding signals, rearranging the extended layer coding signals in subband order and adding them to the core layer frequency domain coefficients, to obtain full-bandwidth frequency domain coefficients; and
[0033] reorganizing the full-bandwidth frequency domain coefficients and dividing them into M groups, performing an inverse time-frequency transform on each group of frequency domain coefficients, and calculating the final audio signal from the M groups of time domain signals obtained by the transform.
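The decoder-side split into M groups can be sketched as below, assuming the encoder interleaved the M subframe groups by frequency index (an assumption; the source does not give the exact ordering). The function name `deinterleave` is hypothetical, and the inverse transform and the combination of the M time domain signals are left out.

```python
def deinterleave(coeffs, m):
    """Split interleaved full-band coefficients back into m subframe groups.

    Assumes coeffs holds, for each frequency index in turn, one coefficient
    from each of the m subframes. Group i collects every m-th coefficient
    starting at position i.
    """
    if m <= 0 or len(coeffs) % m != 0:
        raise ValueError("coefficient count must be a positive multiple of m")
    return [list(coeffs[i::m]) for i in range(m)]
```

Each returned group would then be passed to the inverse time-frequency transform for its subframe.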
[0034] In order to solve the problem mentioned above, the present invention further provides a hierarchical audio encoding system comprising:
[0035] a frequency domain coefficient generation unit, an amplitude envelope calculation unit, an amplitude envelope quantization and encoding unit, a core layer bit allocation unit, a core layer frequency domain coefficient vector quantization and encoding unit, and a bit stream multiplexer; and further comprising: a transient detection unit, an extended layer coding signal generation unit, a residual signal amplitude envelope generation unit, an extended layer bit allocation unit, and an extended layer coding signal vector quantization and encoding unit; wherein
[0036] the transient detection unit is configured to perform a transient detection on an audio signal of a current frame;
[0037] the frequency domain coefficient generation unit is connected to the transient detection unit, and is configured to: when the transient detection determines that the signal is a steady-state signal, perform a time-frequency transform on the audio signal to obtain total frequency domain coefficients; when the transient detection determines that the signal is a transient signal, divide the audio signal into M subframes, perform the time-frequency transform on each subframe, the M groups of frequency domain coefficients obtained by the transform constituting the total frequency domain coefficients of the current frame, and reorganize the total frequency domain coefficients so that their corresponding coding subbands are aligned from low frequencies to high frequencies; wherein the total frequency domain coefficients comprise core layer frequency domain coefficients and extended layer frequency domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extended layer frequency domain coefficients constitute several extended layer coding subbands;
[0038] the amplitude envelope calculation unit is connected to the frequency domain coefficient generation unit and is configured to calculate the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands;
[0039] the amplitude envelope quantization and encoding unit is connected to the amplitude envelope calculation unit and the transient detection unit, and is configured to quantize and encode the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands, to obtain the amplitude envelope quantization indices and the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands; wherein, if the signal is a steady-state signal, the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands are quantized jointly, and if the signal is a transient signal, the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extended layer coding subbands are each reorganized;
[0040] the core layer bit allocation unit is connected to the amplitude envelope quantization and encoding unit and is configured to perform bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, to obtain the bit allocation numbers of the core layer coding subbands;
[0041] the core layer frequency domain coefficient vector quantization and encoding unit is connected to the frequency domain coefficient generation unit, the amplitude envelope quantization and encoding unit and the core layer bit allocation unit, and is configured to: perform normalization, vector quantization and encoding on the frequency domain coefficients of the core layer coding subbands, using the bit allocation numbers of the core layer coding subbands and the quantized amplitude envelope values of the core layer coding subbands reconstructed from their amplitude envelope quantization indices, to obtain encoded bits of the core layer frequency domain coefficients;
[0042] the extended layer coding signal generation unit is connected to the frequency domain coefficient generation unit and to the core layer frequency domain coefficient vector quantization and encoding unit, and is configured to generate the core layer residual signals, to obtain extended layer coding signals consisting of the core layer residual signals and the extended layer frequency domain coefficients;
[0043] the residual signal amplitude envelope generation unit is connected to the amplitude envelope quantization and encoding unit and the core layer bit allocation unit, and is configured to obtain the amplitude envelope quantization indices of the core layer residual signals according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands;
[0044] the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and to the amplitude envelope quantization and encoding unit, and is configured to perform bit allocation on the coding subbands of the extended layer coding signals according to the amplitude envelope quantization indices of the core layer residual signals and the amplitude envelope quantization indices of the extended layer coding subbands, in order to obtain the bit allocation numbers of the coding subbands of the extended layer coding signals;
[0045] the extended layer coding signal vector quantization and encoding unit is connected to the amplitude envelope quantization and encoding unit, the extended layer bit allocation unit, the residual signal amplitude envelope generation unit and the extended layer coding signal generation unit, and is configured to: perform normalization, vector quantization and encoding on the extended layer coding signals, using the bit allocation numbers of the coding subbands of the extended layer coding signals and the quantized amplitude envelope values of those coding subbands reconstructed from their amplitude envelope quantization indices, to obtain encoded bits of the extended layer coding signals;
[0046] the bit stream multiplexer is connected to the amplitude envelope quantization and encoding unit, the core layer frequency domain coefficient vector quantization and encoding unit and the extended layer coding signal vector quantization and encoding unit, and is configured to pack the side information bits of the core layer, the encoded amplitude envelope bits of the core layer coding subbands, the encoded bits of the core layer frequency domain coefficients, the side information bits of the extended layer, the encoded amplitude envelope bits of the extended layer coding subbands and the encoded bits of the extended layer coding signals.
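The packing performed by the bit stream multiplexer can be sketched as follows. The six fields are those listed for the multiplexer; placing them in the stream in the listed order, with all core layer fields first, is an assumption of this sketch, as are the field names and the representation of bits as strings of "0" and "1".

```python
# Hypothetical field names; the order mirrors the list given for the multiplexer.
FIELD_ORDER = (
    "core_side_info",
    "core_envelope_bits",
    "core_coefficient_bits",
    "ext_side_info",
    "ext_envelope_bits",
    "ext_signal_bits",
)


def multiplex(fields):
    """Concatenate the encoded fields of one frame into a single bit string."""
    missing = [name for name in FIELD_ORDER if name not in fields]
    if missing:
        raise KeyError("missing fields: " + ", ".join(missing))
    return "".join(fields[name] for name in FIELD_ORDER)
```

Keeping the core layer fields at the front means a stream truncated after them would still contain everything needed for core layer decoding, which matches the layered decoding goal described in the background.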
[0047] In order to solve the problem mentioned above, the present invention also provides a hierarchical audio decoding system comprising: a bit stream demultiplexer, an amplitude envelope decoding unit, a core layer bit allocation unit, and a core layer decoding and inverse quantization unit; and further comprising: a residual signal amplitude envelope generation unit, an extended layer bit allocation unit, an extended layer coding signal decoding and inverse quantization unit, a full-bandwidth frequency domain coefficient recovery unit, a noise filling unit and an audio signal recovery unit; wherein
[0048] the amplitude envelope decoding unit is connected to the bit stream demultiplexer, and is configured to: decode the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands output by the bit stream demultiplexer, to obtain the amplitude envelope quantization indices of the core layer coding subbands and the extended layer coding subbands; and, if the transient detection information indicates a transient signal, further reorganize the amplitude envelope quantization indices of the core layer coding subbands and of the extended layer coding subbands in order of frequency, from low to high;
[0049] the core layer bit allocation unit is connected to the amplitude envelope decoding unit and is configured to perform bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, to obtain the bit allocation numbers of the core layer coding subbands;
[0050] the core layer decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to: calculate the quantized amplitude envelope values of the core layer coding subbands from the amplitude envelope quantization indices of the core layer coding subbands, and perform decoding, inverse quantization and inverse normalization on the encoded bits of the core layer frequency domain coefficients output by the bit stream demultiplexer, using the bit allocation numbers and the quantized amplitude envelope values of the core layer coding subbands, to obtain the core layer frequency domain coefficients;
[0051] the residual signal amplitude envelope generation unit is connected to the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to: look up a statistical table of correction values for the amplitude envelope quantization indices of the core layer residual signals, according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding core layer coding subbands, in order to obtain the amplitude envelope quantization indices of the core layer residual signals;
[0052] the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope decoding unit, and is configured to: perform bit allocation on the coding subbands of the extended layer coding signals according to the amplitude envelope quantization indices of the core layer residual signals and the amplitude envelope quantization indices of the extended layer coding subbands, to obtain the bit allocation numbers of the coding subbands of the extended layer coding signals;
[0053] the extended layer coding signal decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit, the extended layer bit allocation unit and the residual signal amplitude envelope generation unit, and is configured to: calculate the quantized amplitude envelope values of the coding subbands of the extended layer coding signals from the amplitude envelope quantization indices of those coding subbands, and perform decoding, inverse quantization and inverse normalization on the encoded bits of the extended layer coding signals output by the bit stream demultiplexer, using the bit allocation numbers and the quantized amplitude envelope values of the coding subbands of the extended layer coding signals, to obtain the extended layer coding signals;
[0054] the full-bandwidth frequency domain coefficient recovery unit is connected to the core layer decoding and inverse quantization unit and the extended layer coding signal decoding and inverse quantization unit, and is configured to: rearrange the extended layer coding signals output by the extended layer coding signal decoding and inverse quantization unit in subband order, and then add them to the core layer frequency domain coefficients output by the core layer decoding and inverse quantization unit, to obtain the full-bandwidth frequency domain coefficients;
[0055] the noise filling unit is connected to the full-bandwidth frequency domain coefficient recovery unit and the amplitude envelope decoding unit, and is configured to perform noise filling on the subbands to which no encoded bits were allocated in the encoding process;
[0056] the audio signal recovery unit is connected to the noise filling unit, and is configured to: if the transient detection information indicates a steady-state signal, perform an inverse time-frequency transform directly on the full-bandwidth frequency domain coefficients to obtain an audio signal for output; and if the transient detection information indicates a transient signal, reorganize the full-bandwidth frequency domain coefficients, divide them into M groups of frequency domain coefficients, perform the inverse time-frequency transform on each group, and calculate the final audio signal from the M groups of time domain signals obtained by the transform.
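The role of the noise filling unit can be sketched as follows. The source says only that noise is filled into subbands that received no encoded bits; scaling uniform random noise by the decoded amplitude envelope of each such subband is an assumption of this sketch, and the function name `fill_noise`, its parameters and the fixed seed are hypothetical.

```python
import random


def fill_noise(coeffs, band_edges, bits, envelopes, seed=0):
    """Fill subbands that were allocated zero bits with scaled random noise.

    band_edges[j] = (start, end) gives the coefficient range of subband j,
    bits[j] its bit allocation number and envelopes[j] its decoded amplitude
    envelope value. Subbands with bits[j] > 0 are left untouched.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = list(coeffs)
    for (start, end), allocated, envelope in zip(band_edges, bits, envelopes):
        if allocated == 0:
            for k in range(start, end):
                out[k] = envelope * rng.uniform(-1.0, 1.0)
    return out
```

Called with two subbands where only the first was allocated bits, the first subband passes through unchanged and the second receives noise bounded by its envelope value.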
[0057] In conclusion, the present invention introduces a processing method for transient signal frames into the hierarchical audio encoding and decoding methods: a segmented time-frequency transformation is performed on each transient signal frame, and the frequency domain coefficients obtained by the transformation are then rearranged within the core layer and within the extended layer respectively, so that the same subsequent coding processes used for steady state signal frames (bit allocation, coding of frequency domain coefficients, and so on) can be applied. This improves the coding efficiency of transient signal frames and the quality of hierarchical audio encoding and decoding.
[0058] BRIEF DESCRIPTION OF THE DRAWINGS
[0059] FIG. 1 is a schematic diagram of a hierarchical audio encoding method, in accordance with the present invention;
[0060] FIG. 2 is a flow chart of a hierarchical audio encoding method, according to an embodiment of the present invention;
[0061] FIG. 3 is a flow chart of a method for performing bit allocation correction after vector quantization, in accordance with the present invention;
[0062] FIG. 4 is a schematic diagram of a hierarchically encoded bit stream, in accordance with the present invention;
[0063] FIG. 5 is a schematic diagram of a relationship between a hierarchy in terms of a range of frequencies and a hierarchy in terms of a bit rate, according to the present invention;
[0064] FIG. 6 is a structural diagram of a hierarchical audio coding system, in accordance with the present invention;
[0065] FIG. 7 is a schematic diagram of a hierarchical audio decoding method, according to the present invention;
[0066] FIG. 8 is a flow chart of a hierarchical audio decoding method, according to an embodiment of the present invention; and
[0067] FIG. 9 is a structural diagram of a hierarchical audio decoding system, in accordance with the present invention.
PREFERRED EMBODIMENTS OF THE PRESENT INVENTION
[0068] The main idea of the hierarchical audio encoding and decoding method and system according to the present invention is to introduce a processing method for transient signal frames into the hierarchical audio encoding and decoding methods: a segmented time-frequency transformation is performed on each transient signal frame, and the frequency domain coefficients obtained by the transformation are then rearranged within the core layer and within the extended layer respectively, so that the same subsequent coding processes used for steady state signal frames (bit allocation, coding of frequency domain coefficients, and so on) can be applied, thereby improving the coding efficiency of transient signal frames and the quality of hierarchical audio encoding and decoding.
CODING METHOD AND SYSTEM
[0069] As shown in FIG. 1, based on the inventive idea mentioned above, the hierarchical audio encoding method according to the present invention comprises the following steps.
[0070] In step 10, a transient detection is performed on an audio signal of a current frame.
[0071] In step 20, the audio signal is processed according to the result of the transient detection, to obtain frequency domain coefficients of a core layer and of an extended layer.
[0072] Specifically, when the transient detection determines that the signal is a steady state signal, a time-frequency transformation is performed on the audio signal to obtain the total frequency domain coefficients; when the transient detection determines that the signal is a transient signal, the audio signal is divided into M subframes, the time-frequency transformation is performed on each subframe, and the M groups of frequency domain coefficients obtained by the transformation constitute the total frequency domain coefficients of the current frame; and the total frequency domain coefficients are rearranged so that their corresponding coding subbands are aligned from low frequencies to high frequencies; wherein the total frequency domain coefficients comprise core layer frequency domain coefficients and extended layer frequency domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extended layer frequency domain coefficients constitute several extended layer coding subbands.
[0073] When the transient detection determines a transient signal, the method of obtaining the total frequency domain coefficients of the current frame comprises:
[0074] combining the N-point time domain sampled signal x(n) of the current frame and the N-point time domain sampled signal xold(n) of the previous frame into a 2N-point time domain sampled signal x̄(n), and then performing windowing and time domain aliasing processing on x̄(n) to obtain an N-point time domain sampled signal x̃(n); and
[0075] performing reversal processing on the time domain signal x̃(n), then adding a sequence of zeros at each end of the signal, dividing the lengthened signal into M mutually overlapping subframes, and then performing windowing, time domain aliasing processing, and the time-frequency transformation on the time domain signal of each subframe, to obtain M groups of frequency domain coefficients, which constitute the total frequency domain coefficients of the current frame.
[0076] When the transient detection determines a transient signal, the frequency domain coefficients are rearranged so that their corresponding coding subbands are aligned from low frequencies to high frequencies within the core layer and within the extended layer, respectively.
[0077] In step 30, the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands are quantized and encoded, to obtain the amplitude envelope quantization indices and the encoded amplitude envelope bits of the core layer coding subbands and the extended layer coding subbands.
[0078] Specifically, the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands are quantized and encoded to obtain their amplitude envelope quantization indices and encoded amplitude envelope bits; wherein, if the signal is a steady state signal, the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands are quantized jointly; and if the signal is a transient signal, the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extended layer coding subbands are rearranged respectively.
[0079] The rearrangement of the amplitude envelope quantization indices specifically comprises:
[0080] rearranging together the amplitude envelope quantization indices of the coding subbands that belong to the same subframe, so that their corresponding frequencies are in ascending or descending order, and joining the amplitude envelope quantization indices at each subframe boundary through two coding subbands that represent peer frequencies and belong to the two subframes respectively.
[0081] When the transient detection determines a steady state signal, Huffman coding is performed on the amplitude envelope quantization indices of the core layer coding subbands obtained by the quantization; if the total number of bits consumed by Huffman coding of the amplitude envelope quantization indices of all core layer coding subbands is less than the total number of bits consumed by natural coding of those indices, Huffman coding is used, otherwise natural coding is used, and the Huffman coding indicator for the amplitude envelopes of the core layer coding subbands is set accordingly; likewise, Huffman coding is performed on the amplitude envelope quantization indices of the extended layer coding subbands obtained by the quantization, and if the total number of bits consumed by Huffman coding of the amplitude envelope quantization indices of all extended layer coding subbands is less than the total number of bits consumed by natural coding of those indices, Huffman coding is used, otherwise natural coding is used, and the Huffman coding indicator for the amplitude envelopes of the extended layer coding subbands is set accordingly.
[0082] In step 40, bit allocation is performed on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, and the core layer frequency domain coefficients are then quantized and encoded to obtain the encoded bits of the core layer frequency domain coefficients.
[0083] The method for obtaining the encoded bits of the core layer frequency domain coefficients comprises:
[0084] normalizing the core layer frequency domain coefficients according to the quantized amplitude envelope values of the core layer coding subbands, which are reconstructed from the amplitude envelope quantization indices of the core layer coding subbands, and performing quantization and encoding with a pyramid lattice vector quantization method and a sphere lattice vector quantization method respectively, according to the bit allocation numbers of the coding subbands, to obtain the encoded bits of the core layer frequency domain coefficients;
[0085] performing Huffman coding on the quantization indices of the core layer that are obtained by pyramid lattice vector quantization;
[0086] if the total number of bits consumed by Huffman coding of all quantization indices obtained by pyramid lattice vector quantization is less than the total number of bits consumed by natural coding of those indices, using Huffman coding, correcting the bit allocation numbers of the core layer coding subbands using the bits saved by Huffman coding, the bits remaining after the first bit allocation, and the total bits saved by encoding all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and performing vector quantization and Huffman coding again on the core layer coding subbands whose bit allocation numbers are corrected; otherwise, using natural coding, correcting the bit allocation numbers of the core layer coding subbands using the bits remaining after the first bit allocation and the total bits saved by encoding all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and performing vector quantization and natural coding again on the core layer coding subbands whose bit allocation numbers are corrected.
[0087] In step 50, the core layer frequency domain coefficients on which vector quantization was performed are inversely quantized, and the difference between them and the original frequency domain coefficients obtained by the time-frequency transformation is calculated, to obtain the core layer residual signals.
[0088] In step 60, the amplitude envelope quantization indices of the core layer residual signals are calculated from the amplitude envelope quantization indices and the bit allocation numbers of the core layer coding subbands.
[0089] The amplitude envelope quantization indices of the coding subbands of the core layer residual signals are calculated using the following method:
[0090] calculating a correction value for the amplitude envelope quantization index of the core layer residual signal according to the bit allocation number of the core layer coding subband; and calculating the difference between the amplitude envelope quantization index of the core layer coding subband and the correction value for the corresponding coding subband, to obtain the amplitude envelope quantization index of the core layer residual signal.
[0091] The correction value for the amplitude envelope quantization index of the core layer residual signal of each coding subband is greater than or equal to 0, and does not decrease as the bit allocation number of the corresponding core layer coding subband increases; and
[0092] when the bit allocation number of a core layer coding subband is 0, the correction value for the amplitude envelope quantization index of the core layer residual signal is 0, and when the bit allocation number of a core layer coding subband is the defined maximum bit allocation number, the amplitude envelope value of the corresponding core layer residual signal is 0.
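The calculation in step 60 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table CORRECTION, the constant MAX_BITS, and the function name are hypothetical, and the table values are chosen only to satisfy the stated constraints (non-negative, non-decreasing, and 0 for a bit allocation number of 0). Rounding fractional bit allocation numbers down before the table lookup is also an assumption.

```python
# Hypothetical maximum bit allocation number per coefficient.
MAX_BITS = 8
# Hypothetical correction table indexed by bit allocation number (0..MAX_BITS).
# The text only requires values >= 0 that never decrease, with 0 at index 0.
CORRECTION = [0, 1, 2, 3, 4, 5, 6, 7, 8]

def residual_envelope_indices(core_env_idx, core_bits):
    """Return the residual amplitude envelope quantization index per subband.

    A subband that already received MAX_BITS has a residual envelope value
    of 0; that case is represented here by None (no index needed).
    """
    out = []
    for thq, bits in zip(core_env_idx, core_bits):
        if bits >= MAX_BITS:
            out.append(None)
        else:
            # Fractional allocations are rounded down (assumption).
            out.append(thq - CORRECTION[int(bits)])
    return out

indices = residual_envelope_indices([10, 12, 7], [0, 3, 8])
```

With these illustrative values, a subband that received no bits keeps its original index, since its residual equals the original coefficients.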
[0093] In step 70, bit allocation is performed on the coding subbands of the extended layer coding signals according to the amplitude envelope quantization indices of the core layer residual signals and of the extended layer coding subbands, and the extended layer coding signals are then quantized and encoded to obtain the encoded bits of the extended layer coding signals, where the extended layer coding signals consist of the core layer residual signals and the extended layer frequency domain coefficients.
[0094] The method for obtaining the encoded bits of the extended layer coding signals comprises:
[0095] normalizing the extended layer coding signals according to the quantized amplitude envelope values of their coding subbands, which are reconstructed from the amplitude envelope quantization indices of those coding subbands, and performing quantization and encoding according to the bit allocation numbers of the coding subbands of the extended layer coding signals, using the pyramid lattice vector quantization method and the sphere lattice vector quantization method respectively, to obtain the encoded bits of the extended layer coding signals.
[0096] When the core layer frequency domain coefficients and the extended layer coding signals are quantized and encoded, a vector to be quantized in a coding subband whose bit allocation number is less than a classification threshold is quantized and encoded with the pyramid lattice vector quantization method, and a vector to be quantized in a coding subband whose bit allocation number is greater than the classification threshold is quantized and encoded with the sphere lattice vector quantization method;
[0097] the bit allocation number is the number of bits allocated to a single coefficient in a coding subband.
[0098] It can be understood that the extended layer coding signals consist of the core layer residual signals and the extended layer frequency domain coefficients, and that the core layer residual signals are themselves made up of coefficients.
[0099] Huffman coding is performed on all quantization indices of the extended layer that are obtained by pyramid lattice vector quantization;
[00100] if the total number of bits consumed by Huffman coding of all quantization indices obtained by pyramid lattice vector quantization is less than the total number of bits consumed by natural coding of those indices, Huffman coding is used, the bit allocation numbers of the coding subbands of the extended layer coding signals are corrected using the bits saved by Huffman coding, the bits remaining after the first bit allocation, and the total bits saved by encoding all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and vector quantization and Huffman coding are performed again on the coding subbands of the extended layer coding signals whose bit allocation numbers are corrected; otherwise, natural coding is used, the bit allocation numbers of the coding subbands of the extended layer coding signals are corrected using the bits remaining after the first bit allocation and the total bits saved by encoding all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and vector quantization and natural coding are performed again on the coding subbands of the extended layer coding signals whose bit allocation numbers are corrected.
[00101] When bit allocation is performed on the core layer coding subbands and on the coding subbands of the extended layer coding signals, bits are allocated to the coding subbands with a variable step length, according to the amplitude envelope quantization indices of the coding subbands.
[00102] In the bit allocation process, the step length for allocating bits to a coding subband whose bit allocation number is 0 is 1 bit, and the importance of the subband is reduced by 1 after the allocation; the step length for additionally allocating bits to a coding subband whose bit allocation number is greater than 0 and less than the classification threshold is 0.5 bit, and the importance is reduced by 0.5 after the allocation; and the step length for additionally allocating bits to a coding subband whose bit allocation number is greater than or equal to the classification threshold is 1 bit, and the importance is reduced by 1 after the allocation.
[00103] The bit allocation numbers of the coding subbands are corrected as follows:
[00104] calculating the number of bits available for correction; and
[00105] searching all coding subbands for the coding subband with the greatest importance; if the number of bits allocated to that coding subband has already reached the maximum value that can be allocated, setting the importance of that coding subband to the lowest level and no longer correcting its bit allocation number; otherwise, performing the bit allocation correction on that coding subband.
[00106] In the bit allocation correction process, 1 bit is allocated to a coding subband whose bit allocation number is 0, and its importance is reduced by 1 after the allocation; 0.5 bit is allocated to a coding subband whose bit allocation number is greater than 0 and less than 5, and its importance is reduced by 0.5 after the allocation; and 1 bit is allocated to a coding subband whose bit allocation number is greater than 5, and its importance is reduced by 1 after the allocation.
[00107] Each time a bit allocation number is corrected, the bit allocation correction iteration count is incremented by 1; the bit allocation correction process ends when the iteration count reaches a predefined upper limit, or when the number of bits remaining for correction is less than the number of bits required for a bit allocation correction.
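The correction loop described above can be sketched as follows. This is a hedged illustration only: MAX_BITS, MAX_ITERS, the function name, and the cost model (step length multiplied by the subband width in coefficients) are assumptions, as is treating a bit allocation number of exactly 5 like the values above it. How the importance values are initialised is not covered here, so they are passed in.

```python
THRESHOLD = 5            # classification threshold used in the text's example
MAX_BITS = 9             # hypothetical maximum bits per coefficient
MAX_ITERS = 32           # hypothetical upper limit on correction iterations
LOWEST = float("-inf")   # marks a subband that must not be corrected again

def correct_allocation(bits, importance, widths, spare_bits):
    """Greedy bit allocation correction with a variable step length."""
    bits, importance = list(bits), list(importance)
    iters = 0
    while iters < MAX_ITERS:
        # Subband with the greatest importance (first one on ties).
        j = max(range(len(bits)), key=lambda i: importance[i])
        if importance[j] == LOWEST:
            break                      # every subband is saturated
        if bits[j] >= MAX_BITS:
            importance[j] = LOWEST     # saturated: never pick it again
            continue
        step = 0.5 if 0 < bits[j] < THRESHOLD else 1.0
        cost = int(step * widths[j])   # bits needed for the whole subband
        if cost > spare_bits:
            break                      # not enough bits left for this step
        bits[j] += step
        importance[j] -= step
        spare_bits -= cost
        iters += 1
    return bits, spare_bits

new_bits, left = correct_allocation([0, 2, 9], [3.0, 2.5, 10.0], [16, 16, 16], 40)
```

In this example the third subband is already at the hypothetical maximum, so its importance is lowered and the spare bits go to the other two subbands in half-bit and one-bit steps.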
[00108] In step 80, the encoded amplitude envelope bits of the core layer and extended layer coding subbands, the encoded bits of the core layer frequency domain coefficients, and the encoded bits of the extended layer coding signals are multiplexed and packed, and then transmitted to a decoding terminal.
[00109] Multiplexing and packing are performed according to the following bit stream format:
[00110] first, the side information bits of the core layer are written after the frame header of the bit stream, then the encoded amplitude envelope bits of the core layer coding subbands are written into the bit stream multiplexer (MUX), and then the encoded bits of the core layer frequency domain coefficients are written into the MUX;
[00111] next, the side information bits of the extended layer are written into the MUX, then the encoded amplitude envelope bits of the coding subbands of the extended layer frequency domain coefficients are written into the MUX, and then the encoded bits of the extended layer coding signals are written into the MUX; and
[00112] the number of bits that satisfies the required bit rate is transmitted to the decoding terminal.
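The field order and the rate-dependent truncation can be illustrated with a short sketch. Bit fields are modelled as strings of '0' and '1' purely for readability; the function names, the string representation, and the omission of a frame header are assumptions, not part of the described format.

```python
def pack_frame(core_side, core_env, core_coef, ext_side, ext_env, ext_coef):
    """Concatenate the fields in the order given in the text."""
    return core_side + core_env + core_coef + ext_side + ext_env + ext_coef

def truncate_for_rate(frame_bits, bit_rate_bps, frame_ms=20):
    """Keep only the leading bits that fit the requested bit rate."""
    budget = bit_rate_bps * frame_ms // 1000   # bits available per frame
    return frame_bits[:budget]

frame = pack_frame("1", "0101", "1111", "0", "0011", "1010")
core_only = truncate_for_rate(frame, bit_rate_bps=450)
```

Because the core layer fields come first, cutting the frame at a lower bit budget drops extended layer bits before any core layer bits, which is what makes the stream hierarchical.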
[00113] The present invention is described in detail below with reference to the attached drawings and embodiments.
[00114] FIG. 2 is a flow chart of a hierarchical audio encoding method according to a first embodiment of the present invention. In the present embodiment, the hierarchical audio encoding method is illustrated by taking an audio stream with a frame length of 20 ms and a sampling rate of 32 kHz as an example. The method of the present invention is also applicable to other frame lengths and sampling rates. As shown in FIG. 2, the method comprises the following steps.
[00115] In step 101, transient detection is performed on the audio stream with the frame length of 20 ms and the sampling rate of 32 kHz, to determine whether the current frame is a transient signal or a steady state signal. When the frame is determined to be a transient signal, the transient detection indicator bit Flag_transient is set to 1; when it is determined to be a steady state signal, Flag_transient is set to 0.
[00116] The transient detection technology used by the present invention can be a simple threshold detection method, or a more complex technology, including but not limited to a perceptual entropy method, a multi-detection method, and so on.
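As an illustration of what a simple threshold detector could look like, the sketch below compares the energies of consecutive blocks within a frame. It is not the detector used by the invention: the block count, the ratio threshold, and the function name are all hypothetical choices.

```python
def detect_transient(frame, num_blocks=4, ratio_threshold=8.0):
    """Return 1 if the energy jumps sharply between blocks, else 0.

    The return value plays the role of Flag_transient.
    """
    size = len(frame) // num_blocks
    # A small floor avoids division by zero for silent blocks.
    energies = [sum(s * s for s in frame[i * size:(i + 1) * size]) + 1e-12
                for i in range(num_blocks)]
    for prev, cur in zip(energies, energies[1:]):
        if cur / prev > ratio_threshold:
            return 1
    return 0

steady_flag = detect_transient([1.0] * 640)
attack_flag = detect_transient([0.01] * 320 + [1.0] * 320)
```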
[00117] In step 102, a time-frequency transformation is performed on the audio stream with the frame length of 20 ms and the sampling rate of 32 kHz, to obtain the frequency domain coefficients at N frequency domain sampling points.
[00118] A specific way of implementing this step can be as follows.
[00119] A 2N-point time domain sampled signal x̄(n) is formed from the N-point time domain sampled signal x(n) of the current frame and the N-point time domain sampled signal xold(n) of the previous frame, and can be represented by the following equation:

[00120] windowing is performed on x̄(n) to obtain a windowed signal: xw(n) = h(n) x̄(n) (2)
[00121] where h(n) is a window function defined as:

[00122] The windowed signal xw(n), with a frame length of 40 ms, is transformed into a signal x̃(n) with a frame length of 20 ms by time domain aliasing processing, as follows:

[00123] If the transient detection indicator bit Flag_transient is 0, indicating that the current frame is a steady state signal, the type IV discrete cosine transform (DCT-IV), or another type of discrete cosine transform, is performed directly on the time domain aliased signal x̃(n), to obtain the following frequency domain coefficients:

[00124] If the transient detection indicator bit Flag_transient is 1, indicating that the current frame is a transient signal, reversal processing is first performed on the time domain aliased signal x̃(n) to reduce spurious time domain and frequency domain responses. Then a sequence of zeros of length N/8 is added at each end of the signal, and the lengthened signal is divided into 4 mutually overlapping subframes of equal length. Each subframe has a length of N/2, and adjacent subframes overlap by 50%. Each of the two middle subframes is windowed with a sine window of length N/2; for each of the two subframes at the ends, the inner half of the subframe is windowed with a half sine window of length N/4. Then time domain aliasing processing and the DCT-IV transform are performed on each windowed subframe, to obtain 4 groups of frequency domain coefficients of length N/4 each, which together constitute the frequency domain coefficients Y(k), k = 0, ..., N-1, with a total length of N.
[00125] When the frame length is 20 ms and the sampling rate is 32 kHz, N = 640 (the corresponding N can be calculated in the same way for other frame lengths and sampling rates).
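The two transform paths can be sketched as follows. The DCT-IV below is the standard unnormalised definition, and its scaling is an assumption since the equation is not reproduced in this text. The segmentation function covers only the reversal, zero padding, and subframe split; the per-subframe windowing and time domain aliasing are left out because their formulas are not shown here. Reading the reversal step as a time reversal of the samples is also an assumption, and both function names are hypothetical.

```python
import math

def dct_iv(x):
    """Direct O(n^2) type-IV DCT without normalisation."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * (k + 0.5))
                for i in range(n))
            for k in range(n)]

def split_transient_frame(x_tilde):
    """Reverse the frame, pad N/8 zeros per side, cut 4 overlapping subframes."""
    n = len(x_tilde)
    pad = [0.0] * (n // 8)
    extended = pad + x_tilde[::-1] + pad      # length 5N/4
    size, hop = n // 2, n // 4                # 50% overlap
    return [extended[m * hop:m * hop + size] for m in range(4)]

N = 32000 * 20 // 1000                        # 640 samples per 20 ms frame
subframes = split_transient_frame([float(i) for i in range(N)])
```

The arithmetic is consistent with the description: N + 2(N/8) = 5N/4 samples, which is exactly covered by four subframes of length N/2 advancing by N/4.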
[00126] In step 103, the N-point frequency domain coefficients are divided into several coding subbands, and the frequency domain amplitude envelopes (abbreviated as amplitude envelopes) of all coding subbands are calculated.
[00127] The frequency domain coefficients can be divided into coding subbands uniformly or non-uniformly; in the present embodiment, the division is non-uniform.
[00128] The present step can be implemented through the following substeps.
[00129] In step 103a, the frequency domain coefficients in the frequency range to be encoded are divided into L subbands (referred to as coding subbands).
[00130] In the present embodiment, the frequency range to be encoded is 0 ~ 13.6 kHz, and the subbands are obtained by non-uniform division according to the characteristics of human auditory perception. Table 1 and Table 2 give a specific division for Flag_transient = 0 and Flag_transient = 1, respectively.
[00131] In Table 1 and Table 2, the frequency domain coefficients in the frequency range 0 ~ 13.6 kHz are divided into 30 coding subbands, that is, L = 30; and the frequency domain coefficients above 13.6 kHz are set to 0.
[00132] In the present embodiment, the frequency range of the core layer is also obtained by division. For Flag_transient = 0 and Flag_transient = 1, the subbands numbered 0 ~ 17 in Table 1 and Table 2 respectively are selected as the core layer subbands, so the number of core layer coding subbands is L_core = 18. The frequency range of the core layer is 0 ~ 7 kHz.
[00133] When Flag_transient is 1, the 4 groups of frequency domain coefficients in the frequency range to be encoded are divided into subbands, and the frequency domain coefficients in the core layer frequency range and in the extended layer frequency range are then rearranged so that their corresponding coding subbands are aligned from low frequencies to high frequencies, respectively. When the frequency domain coefficients remaining in a group are too few to constitute a subband (in Table 2, fewer than 16), frequency domain coefficients with the same or similar frequencies in the next group are used to fill the subband, as for subbands 16 and 17 of the core layer in Table 2. The coding subbands in Table 2 are a specific result of the completed rearrangement.
[00134] It can be understood that the frequency domain coefficients that constitute the core layer coding subbands are referred to as the core layer frequency domain coefficients, and those that constitute the extended layer coding subbands are referred to as the extended layer frequency domain coefficients. Equivalently, the frequency domain coefficients are divided into core layer frequency domain coefficients and extended layer frequency domain coefficients, the core layer frequency domain coefficients are divided into several core layer coding subbands, and the extended layer frequency domain coefficients are divided into several extended layer coding subbands. The order in which the frequency domain coefficients are divided into layers (core layer and extended layer) and into coding subbands does not affect the implementation of the present invention.
[00135] Table 1 Example of sub-band division when the transient detection indicator bit Flag_transient is 0

[00136] Table 2 Example of sub-band division when the transient detection indicator bit Flag_transient is 1

[00137] In step 103b, the amplitude envelopes of the coding subbands are calculated according to the following equation:

[00138] where LIndex(j) and HIndex(j) represent the index of the starting frequency domain coefficient and the index of the ending frequency domain coefficient of the j-th coding subband respectively; the specific values are shown in Table 1 (when the transient detection indicator bit Flag_transient is 0) and in Table 2 (when Flag_transient is 1).
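A per-subband envelope calculation of this kind can be sketched as below. Since the equation itself is not reproduced in this text, the sketch assumes the envelope is the root mean square of the coefficients in the subband, and that HIndex(j) is inclusive; both are assumptions, and the function name is hypothetical.

```python
import math

def amplitude_envelopes(coeffs, l_index, h_index):
    """RMS of the coefficients in each coding subband [LIndex(j), HIndex(j)]."""
    envelopes = []
    for lo, hi in zip(l_index, h_index):
        band = coeffs[lo:hi + 1]               # HIndex(j) taken as inclusive
        envelopes.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return envelopes

env = amplitude_envelopes([3.0, 4.0, 0.0, 0.0], [0, 2], [1, 3])
```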
[00139] In step 104, when the transient detection indicator bit Flag_transient is 1, the amplitude values of the encoding subbands of the core layer and coding subbands of the extended layer are quantized and encoded, in order to obtain quantification indexes of amplitude envelopes of the coding subbands of the core layer and coding subbands of the extended layer and the encoded bits of amplitude envelopes of the coding subbands of the core layer and subbands of encoding of the amplified layer, where the encoded bits of amplitude envelopes of the core layer encoding subbands and the encoded bits of amplitude envelopes of the encoding subband of the amplified layer are required to be transmitted to a demultiplexer of bit streams).
[00140] When the flag_transient transient detection indicator bit is 0, the amplitude values of the encoding subbands of the core layer and coding subbands of the extended layer are quantified together; and when the transient detection indicator bit Flag_transient is 1, the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands are quantified separately and the envelope quantification indices amplitude of the coding subbands of the core layer and the quantification indexes of amplitude of the coding subbands of the extended layer are rearranged accordingly.
[00141] The process of quantifying and coding the amplitude envelopes of the coding sub-bands of the core layer is illustrated in the following text.
[00142] The amplitude envelope of each coding subband is quantized using the following equation (7) to obtain the amplitude envelope quantization index of each coding subband, that is, the output value of the quantizer:
$Th_q(j) = \lfloor 2\log_2 Th(j) \rfloor, \quad j = 0, \ldots, L\_core - 1 \quad (7)$
[00143] where,

[00144] $\lfloor x \rfloor$ denotes rounding down. $Th_q(0)$ is the amplitude envelope quantization index of the first coding subband of the core layer, and its range is limited to [-5, 34]; i.e., when $Th_q(0) < -5$, $Th_q(0)$ is set to -5, and when $Th_q(0) > 34$, $Th_q(0)$ is set to 34.
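By way of a non-limiting illustration, the envelope quantization described above may be sketched as follows. The function name `quantize_envelope` and its parameter names are hypothetical and are introduced only for this sketch; the rule $\lfloor 2\log_2 Th(j) \rfloor$ and the range [-5, 34] of the first index are taken from the description, and every Th(j) is assumed to be strictly positive.

```python
import math

def quantize_envelope(th, first_min=-5, first_max=34):
    """Quantize amplitude envelopes Th(j) to indices floor(2*log2(Th(j))).

    Only the index of the first subband is clamped to [first_min, first_max],
    as stated for Thq(0) in the description. Assumes every Th(j) > 0.
    """
    thq = [math.floor(2 * math.log2(v)) for v in th]
    thq[0] = min(max(thq[0], first_min), first_max)
    return thq
```

For example, `quantize_envelope([1.0, 2.0, 0.01, 8.0])` yields the indices 0, 2, -14 and 6.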
[00145] When the transient detection indicator bit Flag_transient is 1, the amplitude envelope quantization indices of the core layer coding subbands are reorganized so that the subsequent differential coding of these indices is more efficient.
[00146] The specific example of the reorganization is shown in Table 3.
[00147] Table 3 Example of the reorganization of the amplitude envelopes of the coding subbands of the core layer

[00148] The amplitude envelope quantization index $Th_q(0)$ of the first coding subband is encoded using 6 bits, i.e., it consumes 6 bits.
[00149] The differential values between the amplitude envelope quantization indices of the core layer coding subbands are calculated using the following equation: $\Delta Th_q(j) = Th_q(j+1) - Th_q(j), \quad j = 0, \ldots, L\_core - 2 \quad (8)$
[00150] The amplitude envelope can be corrected as follows to ensure that the range of $\Delta Th_q(j)$ is within [-15, 16]:
[00151] if $\Delta Th_q(j) < -15$, then set $\Delta Th_q(j) = -15$ and $Th_q(j) = Th_q(j+1) + 15$, for $j = L\_core - 2, \ldots, 0$;
[00152] if $\Delta Th_q(j) > 16$, then set $\Delta Th_q(j) = 16$ and $Th_q(j+1) = Th_q(j) + 16$, for $j = 0, \ldots, L\_core - 2$.
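The differential calculation and the two correction passes above may be illustrated by the following sketch. The function name `differential_indices` is hypothetical. The backward pass (j from L_core - 2 down to 0) enforces the lower bound and the forward pass (j from 0 to L_core - 2) enforces the upper bound, in the order the text gives them.

```python
def differential_indices(thq, lo=-15, hi=16):
    """Correct envelope indices so every difference lies in [lo, hi].

    Returns the corrected indices and the differences
    delta[j] = thq[j + 1] - thq[j].
    """
    thq = list(thq)
    # Backward pass: if the difference is below lo, lower thq[j].
    for j in range(len(thq) - 2, -1, -1):
        if thq[j + 1] - thq[j] < lo:
            thq[j] = thq[j + 1] - lo  # i.e. thq[j + 1] + 15
    # Forward pass: if the difference is above hi, lower thq[j + 1].
    for j in range(len(thq) - 1):
        if thq[j + 1] - thq[j] > hi:
            thq[j + 1] = thq[j] + hi
    deltas = [thq[j + 1] - thq[j] for j in range(len(thq) - 1)]
    return thq, deltas
```

Because both passes only lower an index, the forward pass cannot push a difference below the lower bound that the backward pass has already established.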
[00153] Huffman coding is performed on $\Delta Th_q(j)$, j = 0, ..., L_core - 2, and the number of bits consumed (referred to as the Huffman coded bits) is calculated. If the Huffman coded bits are greater than or equal to the number of fixedly allocated bits (in the present embodiment, greater than or equal to (L_core - 1) × 5), Huffman coding is not used to encode $\Delta Th_q(j)$, j = 0, ..., L_core - 2, and the Huffman coding indicator bit is set to Flag_huff_rms_core = 0; otherwise, Huffman coding is used to encode $\Delta Th_q(j)$, j = 0, ..., L_core - 2, and the Huffman coding indicator bit is set to Flag_huff_rms_core = 1. The coded bits of the amplitude envelope quantization indices of the core layer coding subbands (i.e., the coded bits of the amplitude envelope differential values and of the amplitude envelope of the first subband) and the Huffman coding indicator bit must be transmitted to the MUX.
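The choice between Huffman coding and fixed-length coding of the differential values may be sketched as below. The function name `choose_envelope_coding` and the table `huff_bit_len` (mapping each differential value to its Huffman code length) are hypothetical; the actual Huffman table is not reproduced in this passage. The sketch further assumes exactly 5 fixed bits per differential value, which is the lower end of the allocation stated in the text.

```python
def choose_envelope_coding(deltas, huff_bit_len, fixed_bits_per_delta=5):
    """Return (flag, bits): flag is 1 when Huffman coding is used.

    Huffman coding is rejected when it needs at least as many bits as the
    fixed allocation, matching the "greater than or equal to" test above.
    """
    huff_bits = sum(huff_bit_len[d] for d in deltas)
    fixed_bits = fixed_bits_per_delta * len(deltas)
    if huff_bits >= fixed_bits:
        return 0, fixed_bits
    return 1, huff_bits
```

With a toy table that gives short codes to small differences, a smooth envelope selects Huffman coding, while an envelope with large jumps falls back to the fixed allocation.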
[00154] The process of quantizing and coding the amplitude envelopes of the coding subbands of the extended layer is described below.
[00155] When the transient detection indicator bit Flag_transient is 0, Huffman coding is performed on the amplitude envelope differential values $\Delta Th_q(j)$, j = L_core - 1, ..., L - 2, and the number of bits consumed (referred to as the Huffman coded bits) is calculated. If the Huffman coded bits are greater than or equal to the number of fixedly allocated bits (in the present embodiment, greater than or equal to (L - L_core) × 5), Huffman coding is not used to encode $\Delta Th_q(j)$, j = L_core - 1, ..., L - 2, and the Huffman coding indicator bit is set to Flag_huff_rms_ext = 0; otherwise, Huffman coding is used to encode $\Delta Th_q(j)$, j = L_core - 1, ..., L - 2, and the Huffman coding indicator bit is set to Flag_huff_rms_ext = 1.
[00156] When the transient detection indicator bit Flag_transient is 1, the amplitude envelopes of the coding subbands of the extended layer are quantized according to the following equation to obtain the amplitude envelope quantization indices of the extended layer coding subbands, i.e., the quantizer output values: $Th_q(j) = \lfloor 2\log_2 Th(j) \rfloor, \quad j = L\_core, \ldots, L - 1 \quad (9)$
[00157] where $Th_q(L\_core)$ is the amplitude envelope quantization index of the first coding subband composed of frequency domain coefficients of the extended layer, and its range is limited to [-5, 34]. The amplitude envelope quantization indices of the extended layer coding subbands are reorganized so that the subsequent differential coding of these indices is more efficient. A specific example of the reorganization is shown in Table 4.
[00158] Table 4 Example of the reorganization of the amplitude envelopes of the coding subbands of the extended layer

[00159] The amplitude envelope quantization index $Th_q(L\_core)$ of the first coding subband composed of frequency domain coefficients of the extended layer is encoded using 6 bits, i.e., it consumes 6 bits. The differential values between the amplitude envelope quantization indices of the coding subbands composed of frequency domain coefficients of the extended layer are calculated using the following equation: $\Delta Th_q(j) = Th_q(j+1) - Th_q(j), \quad j = L\_core, \ldots, L - 2 \quad (10)$
[00160] The amplitude envelope can be corrected as follows to ensure that the range of $\Delta Th_q(j)$ is within [-15, 16]:
[00161] if $\Delta Th_q(j) < -15$, set $\Delta Th_q(j) = -15$ and $Th_q(j) = Th_q(j+1) + 15$, for j = L_core, ..., L - 2; and if $\Delta Th_q(j) > 16$, set $\Delta Th_q(j) = 16$ and $Th_q(j+1) = Th_q(j) + 16$, for j = L_core, ..., L - 2. Then Huffman coding is performed on $\Delta Th_q(j)$, j = L_core, ..., L - 2, and the number of bits consumed (referred to as the Huffman coded bits) is calculated. If the Huffman coded bits are greater than or equal to the number of fixedly allocated bits (in the present embodiment, greater than or equal to (L - L_core - 1) × 5), Huffman coding is not used to encode $\Delta Th_q(j)$, j = L_core, ..., L - 2, and the Huffman coding indicator bit is set to Flag_huff_rms_ext = 0; otherwise, Huffman coding is used to encode $\Delta Th_q(j)$, j = L_core, ..., L - 2, and the Huffman coding indicator bit is set to Flag_huff_rms_ext = 1.
[00162] The coded bits of the amplitude envelope quantization indices and the Huffman coding indicator bit of the extended layer must be transmitted to the MUX.
[00163] In step 105, the initial importance values of the core layer coding subbands are calculated according to the rate distortion theory and amplitude envelope information of the core layer coding subbands and then, the bit allocation of the core layer is performed according to the importance of the coding subbands of the core layer.
[00164] The present step can be implemented through the following sub-steps.
[00165] In step 105a, the average bit consumption of a single frequency domain coefficient of the core layer is calculated.
[00166] The number of bits bits_available_core used for encoding the core layer is taken from the total number of bits bits_available that can be provided by a 20 ms frame, and the remaining number of bits bits_left_core available for encoding the frequency domain coefficients of the core layer is obtained by subtracting the number of bits bit_sides_core consumed by the side information of the core layer and the number of bits bits_Th_core consumed by the amplitude envelope quantization indices of the core layer coding subbands, i.e.:
[00167] bits_left_core = bits_available_core - bit_sides_core - bits_Th_core (11)
[00168] The side information comprises the Huffman coding indicator bits Flag_huff_rms_core and Flag_huff_PLVQ_core and the iteration count count_core. Flag_huff_rms_core identifies whether Huffman coding is used for the amplitude envelope quantization indices of the coding subbands of the core layer; Flag_huff_PLVQ_core identifies whether Huffman coding is used when vector coding is performed on the frequency domain coefficients of the core layer; and count_core identifies the number of iterations in which the core layer bit allocation is corrected (see the description in the detailed steps that follow).
[00169] The average bit consumption of a single frequency domain coefficient of the core layer is calculated as R_core:

[00170] where, L_core is the number of the coding subbands of the core layer.
[00171] In step 105b, the optimal bit value under the condition of maximum quantization signal-to-noise ratio gain is calculated according to rate distortion theory.
[00172] Based on the rate distortion of independent Gaussian random variables, the rate distortion is optimized using the Lagrange method, which yields the optimal bit allocation value of each coding subband under the condition of maximum quantization signal-to-noise ratio gain within the rate distortion bound: $rr\_core(j) = [R\_core + R_{min}\_core(j)], \quad j = 0, \ldots, L\_core - 1 \quad (13)$
[00173] where,

[00174] and

[00175] In step 105c, the initial importance value used when bit allocation is performed for the coding subbands of the core layer is calculated.
[00176] From the optimal bit value above and a proportion factor that reflects the characteristics of auditory perception, the initial importance value of each core layer coding subband, which controls the actual bit allocation, is obtained: $rk(j) = \alpha \times rr\_core(j) = \alpha [R\_core + R_{min}\_core(j)], \quad j = 0, \ldots, L\_core - 1 \quad (16)$
[00177] where $\alpha$ is a proportion factor that is related to the coding bit rate and can be obtained through statistical analysis; normally $0 < \alpha < 1$, and in the present embodiment $\alpha = 0.7$. rk(j) represents the importance of the j-th coding subband when bit allocation is performed.
[00178] In step 105d, the bit allocation of the core layer is performed according to the importance of the coding subbands of the core layer. The specific description is as follows.
[00179] First, the core layer coding subband with the maximum importance value is found among all rk(j), and its number is denoted jk. Then the bit allocation number region_bit(jk) of each frequency domain coefficient in that coding subband is increased, and the importance of that coding subband is reduced; meanwhile, the total number of bits bit_band_used(jk) consumed by the coding subband is calculated. Finally, the sum of the bits consumed by all coding subbands of the core layer, sum(bit_band_used(j)), j = 0, ..., L_core - 1, is calculated. This process is repeated until the sum of the consumed bits reaches the maximum value permitted by the limited number of available bits.
[00180] The bit allocation method in the present step can be represented by the following pseudocode:
[00181] set region_bit(j) = 0, j = 0, 1, ..., L_core - 1;
[00182] for coding subbands 0, 1, ..., L_core - 1: {
[00183] search for jk = argmax[rk(j)];
[00184] if region_bit(jk) < classification threshold {
[00185] if region_bit(jk) = 0
[00186] set region_bit(jk) = region_bit(jk) + 1;
[00187] calculate bit_band_used(jk) = region_bit(jk) * BandWidth(jk);
[00188] set rk(jk) = rk(jk) - 1;
[00189] or else, if region_bit(jk) >= 1
[00190] set region_bit(jk) = region_bit(jk) + 0.5;
[00191] calculate bit_band_used(jk) = region_bit(jk) * BandWidth(jk) * 0.5;
[00192] set rk(jk) = rk(jk) - 0.5; }
[00193] or else, if region_bit(jk) >= classification threshold {
[00194] set region_bit(jk) = region_bit(jk) + 1;
[00195] set

[00196] calculate bit_band_used(jk) = region_bit(jk) * BandWidth(jk); }
[00197] calculate bit_used_all = sum(bit_band_used(j)), j = 0, 1, ..., L_core - 1;
[00198] if bit_used_all < bits_left_core - 16, go back and search again for jk among the coding subbands, and repeat the calculation of the bit allocation number (also referred to as the number of coded bits); here, 16 is the maximum number of bits of a core layer coding subband.
[00199] otherwise, end the loop, calculate the bit allocation number, and output the current bit allocation number. }
[00200] Finally, according to the importance of the subbands, the remaining bits, which number fewer than 16, are allocated to the core layer coding subbands that meet the requirements, according to the following principle: 0.5 bits are allocated to each frequency domain coefficient in the core layer coding subbands whose bit allocation is 1, and the importance of those coding subbands is reduced by 0.5, until bits_left_core - bit_used_all < 8, at which point the bit allocation ends. The bits finally left over are kept as the remaining bits remain_bits_core of the initial allocation of the core layer.
[00201] The classification threshold mentioned above is greater than or equal to 2 and less than or equal to 8, and can be 5 in the present embodiment.
[00202] MaxBit is the maximum number of bits that can be allocated to a single frequency domain coefficient in a core layer coding subband, in units of bits per frequency domain coefficient. In the present embodiment, MaxBit = 9. This value can be adjusted according to the coding bit rate of the codec. region_bit(j) is the number of bits allocated to a single frequency domain coefficient in the j-th coding subband of the core layer, i.e., the bit allocation number of a single frequency domain coefficient in that subband.
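The main allocation loop described in step 105d may be sketched as follows, purely for illustration. The function name `allocate_core_bits` is hypothetical. Two points are assumptions rather than statements of the described method: the importance update of the high-bit branch is not legible in this passage, so the sketch lowers the importance by 1 and removes a subband from the search once it reaches MaxBit; and bit_band_used is computed in every branch as region_bit × BandWidth. The final distribution of the fewer than 16 leftover bits is not covered.

```python
def allocate_core_bits(rk, band_width, bits_left_core, threshold=5, max_bit=9):
    """Greedy bit allocation over core layer subbands (illustrative only).

    rk: initial importance per subband; band_width: coefficients per subband.
    Returns (region_bit, bit_band_used) as lists of floats.
    """
    rk = list(rk)
    n = len(rk)
    region_bit = [0.0] * n
    bit_band_used = [0.0] * n
    while sum(bit_band_used) < bits_left_core - 16:
        jk = max(range(n), key=lambda j: rk[j])  # most important subband
        if rk[jk] == float("-inf"):
            break  # every subband has reached max_bit (assumed stop rule)
        if region_bit[jk] < threshold:
            step = 1.0 if region_bit[jk] == 0 else 0.5
        else:
            step = 1.0
        region_bit[jk] += step
        rk[jk] -= step
        if region_bit[jk] >= max_bit:
            rk[jk] = float("-inf")  # assumption: saturated subband is skipped
        bit_band_used[jk] = region_bit[jk] * band_width[jk]
    return region_bit, bit_band_used
```

For two 8-coefficient subbands with importances 3.0 and 1.0 and 40 available bits, the loop stops once 28 bits are consumed, with 2.5 and 1.0 bits per coefficient respectively.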
[00203] In addition, in this step, the bit allocation of the core layer can be performed using $Th_q(j)$ or $\lfloor \mu \times \log_2[Th(j)] + \nu \rfloor$ as the initial importance value for the bit allocation of each core layer coding subband, where j = 0, ..., L_core - 1 and $\mu > 0$.
[00204] The coding subbands described in the following steps 106-107 are coding subbands of the core layer.
[00205] In step 106, the frequency domain coefficients in each coding subband of the core layer are normalized using the quantized amplitude envelope value reconstructed from the amplitude envelope quantization index of that coding subband, and the normalized frequency domain coefficients are then grouped to form several vectors.
[00206] For all j = 0, ..., L_core - 1, all frequency domain coefficients $X_j$ in the coding subband are normalized using the quantized amplitude envelope $2^{Th_q(j)/2}$ of coding subband j:

[00207] Every 8 consecutive coefficients in the coding subband are grouped to form one 8-dimensional vector. According to the division of the coding subbands in Table 1, the coefficients in coding subband j can be grouped into exactly Lattice_D8(j) 8-dimensional vectors. The normalized and grouped 8-dimensional vectors to be quantized are denoted $Y_j^m$, where m is the position of the 8-dimensional vector within the coding subband and ranges from 0 to Lattice_D8(j) - 1.
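Step 106 may be illustrated by the following sketch, in which the function name `normalize_and_group` is hypothetical. It assumes that normalization means dividing each coefficient by the reconstructed envelope $2^{Th_q(j)/2}$, and that the subband length is a multiple of 8, as the subband division is stated to guarantee.

```python
def normalize_and_group(coeffs, thq_j, dim=8):
    """Normalize one subband by 2**(thq_j / 2) and split it into 8-dim vectors."""
    if len(coeffs) % dim != 0:
        raise ValueError("subband length must be a multiple of the vector dimension")
    envelope = 2.0 ** (thq_j / 2.0)  # reconstructed quantized envelope
    normalized = [c / envelope for c in coeffs]
    return [normalized[i:i + dim] for i in range(0, len(normalized), dim)]
```

A 16-coefficient subband therefore yields two vectors, which corresponds to Lattice_D8(j) = 2 in the notation of the description.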
[00208] In step 107, for all j = 0, ..., L_core - 1, the number of bits region_bit(j) allocated to the coding subband is evaluated. If region_bit(j) is below the classification threshold, the coding subband is referred to as a low-bit coding subband, and the vectors to be quantized in it are quantized and coded using the pyramid lattice vector quantization method; if region_bit(j) is greater than or equal to the threshold, the coding subband is referred to as a high-bit coding subband, and the vectors to be quantized in it are quantized and coded using the sphere lattice vector quantization method. The threshold in the present embodiment is 5 bits.
[00209] The pyramid lattice vector quantization and coding method is described below.
[00210] The low-bit coding subbands are quantized using the pyramid lattice vector quantization method; in this case, the number of bits allocated to subband j satisfies 1 <= region_bit(j) < 5.
[00211] The present invention uses 8-dimensional lattice vector quantization based on the D8 grid points, where the D8 grid points are defined as follows:
$D_8 = \{ v = (v_1, v_2, \ldots, v_8)^T \in Z^8 \mid \sum_{i=1}^{8} v_i \text{ is even} \}$
[00212] where $Z^8$ denotes the 8-dimensional integer space. The basic method of mapping (quantizing) 8-dimensional vectors to D8 grid points is described as follows:
[00213] Let x be any real number; f(x) denotes rounding x to the nearer of its two adjacent integers, and w(x) denotes rounding x to the farther of its two adjacent integers. For any vector $X = (x_1, x_2, \ldots, x_8) \in R^8$, $f(X) = (f(x_1), f(x_2), \ldots, f(x_8))$ is defined likewise. In f(X), the smallest subscript among the components with the largest absolute rounding error is selected and denoted k, and $g(X) = (f(x_1), f(x_2), \ldots, w(x_k), \ldots, f(x_8))$ is defined. Exactly one of f(X) and g(X) is a D8 grid point, and the quantized D8 grid point output by the quantizer is:
$f_{D8}(X) = f(X)$ if $f(X) \in D_8$; $f_{D8}(X) = g(X)$ if $g(X) \in D_8$
[00214] The specific steps for quantizing the vectors to D8 grid points and computing the indices of the D8 grid points are as follows.
[00215] a, the energy of the vectors to be quantized is regularized.
[00216] The energy of the vectors to be quantized needs to be regularized before quantization. The codebook serial number index and the energy scale factor scale corresponding to the number of bits are looked up in Table 5 according to the number of bits region_bit(j) allocated to the coding subband j in which the vectors to be quantized are located; the energy of the vectors to be quantized is then regularized according to the following equation: $\tilde{Y}^m_{j,scale} = (Y^m_j - a) \times scale(index) \quad (20)$
[00217] where $Y^m_j$ denotes the m-th normalized 8-dimensional vector to be quantized in coding subband j, $\tilde{Y}^m_{j,scale}$ denotes the 8-dimensional vector obtained after the energy of $Y^m_j$ is regularized, and $a = (2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6}, 2^{-6})$.
[00218] Table 5 Correspondence between the number of bits of the pyramid lattice vector quantization and the codebook serial number, the energy scale factor, and the maximum pyramid surface energy radius

[00219] b, the regularized vectors are quantized to grid points.
[00220] The energy-regularized 8-dimensional vector $\tilde{Y}^m_{j,scale}$ is quantized to the D8 grid point $\hat{Y}^m_j$:

[00221] where, f (•) represents a quantization operator for mapping a given 8-dimension vector to the grid points D8.
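The quantization operator that maps an 8-dimensional vector to a D8 grid point may be sketched as follows. The function name `quantize_d8` is hypothetical. The sketch relies on the standard definition of the D8 lattice (integer vectors whose coordinate sum is even) and on the rule described earlier: round every component to the nearest integer, and if the result is not a D8 point, round the component with the largest rounding error the other way. Rounding of exact halves upward is an assumption of this sketch.

```python
import math

def quantize_d8(x):
    """Map a real vector to the nearest D8 grid point (even coordinate sum)."""
    f = [math.floor(v + 0.5) for v in x]  # f(X): component-wise rounding
    if sum(f) % 2 == 0:
        return f  # f(X) is already a D8 grid point
    # k: lowest index among the components with the largest rounding error
    k = max(range(len(x)), key=lambda i: abs(x[i] - f[i]))
    # w(x_k): round component k to the farther adjacent integer instead
    f[k] += 1 if x[k] > f[k] else -1
    return f
```

For instance, (0.6, 0.1, 0, ..., 0) rounds to (1, 0, ..., 0), whose coordinate sum is odd, so the first component is rounded the other way and the zero vector is returned.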
[00222] c, the energy of $\tilde{Y}^m_{j,scale}$ is truncated according to the pyramid surface energy of the D8 grid point $\hat{Y}^m_j$.
[00223] The energy of the D8 grid point $\hat{Y}^m_j$ is calculated and compared with the maximum pyramid surface energy radius LargeK(index) in the coding codebook. If it does not exceed the maximum pyramid surface energy radius, the index of the grid point in the codebook is calculated; otherwise, the energy of the regularized vector $\tilde{Y}^m_{j,scale}$ to be quantized is truncated until the energy of the grid point to which the truncated vector is quantized is no greater than the maximum pyramid surface energy radius. Then a small portion of its own energy is repeatedly added to the truncated vector until the energy of the D8 grid point to which it is quantized exceeds the maximum pyramid surface energy radius, and the last D8 grid point whose energy does not exceed the maximum pyramid surface energy radius is selected as the quantized value of the vector. The specific process can be described using the following pseudocode.
[00224] The pyramid surface energy of $\hat{Y}^m_j$ is calculated, i.e., the sum of the absolute values of the components of the m-th vector in coding subband j: temp_K = sum(|$\hat{Y}^m_j$|); Ybak = $\hat{Y}^m_j$; Kbak = temp_K
[00225] If temp_K> LargeK (index) {
[00226] While temp_K> LargeK (index) {

[00227] While temp_K <= LargeK(index) { Ybak = $\hat{Y}^m_j$; Kbak = temp_K; $\tilde{Y}^m_{j,scale} = \tilde{Y}^m_{j,scale} + w$; $\hat{Y}^m_j = f_{D8}(\tilde{Y}^m_{j,scale})$; temp_K = sum(|$\hat{Y}^m_j$|) } $\hat{Y}^m_j$ = Ybak; temp_K = Kbak }
[00228] At that point, $\hat{Y}^m_j$ is the last D8 grid point whose energy does not exceed the maximum pyramid surface energy radius, and temp_K is the energy of that grid point.
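The energy truncation of step c may be illustrated by the sketch below. The names `quantize_d8` and `truncate_to_pyramid` are hypothetical. The passage does not legibly state how the energy is truncated or how large the added portion w is, so the sketch assumes that truncation halves the vector and that w is one sixteenth of the truncated vector; both are assumptions made only for this illustration.

```python
import math

def quantize_d8(x):
    """Nearest D8 grid point: integer vector with an even coordinate sum."""
    f = [math.floor(v + 0.5) for v in x]
    if sum(f) % 2 != 0:
        k = max(range(len(x)), key=lambda i: abs(x[i] - f[i]))
        f[k] += 1 if x[k] > f[k] else -1
    return f

def truncate_to_pyramid(y_scale, large_k, max_iter=1000):
    """Return (grid_point, energy) with energy = sum(|components|) <= large_k."""
    y = list(y_scale)
    point = quantize_d8(y)
    temp_k = sum(abs(v) for v in point)
    if temp_k <= large_k:
        return point, temp_k
    while temp_k > large_k:  # truncate (assumed: halve the vector)
        y = [v / 2.0 for v in y]
        point = quantize_d8(y)
        temp_k = sum(abs(v) for v in point)
    w = [v / 16.0 for v in y]  # assumed size of the "small portion"
    best, best_k = point, temp_k
    for _ in range(max_iter):  # grow until the radius is exceeded
        if temp_k > large_k:
            break
        best, best_k = point, temp_k  # last point within the radius
        y = [a + b for a, b in zip(y, w)]
        point = quantize_d8(y)
        temp_k = sum(abs(v) for v in point)
    return best, best_k
```

With a maximum radius of 2, the vector (4, 0, ..., 0) is first halved and then grown in small steps; the last grid point within the radius is (2, 0, ..., 0).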
[00229] d, the quantization indices of the D8 grid points $\hat{Y}^m_j$ in the codebook are generated.
[00230] The indices of the D8 grid points $\hat{Y}^m_j$ in the codebook are obtained by calculation. The specific steps are as follows.
[00231] In step one, the grid points on various pyramid surfaces are labeled respectively according to the energy dimension of the pyramid surface.
[00232] For an integer grid $Z^L$ of dimension L, the pyramid surface with an energy radius of K is defined as:

[00233] N(L, K) denotes the number of grid points in S(L, K). For the integer grid $Z^L$, N(L, K) satisfies the following recursion: $N(L, 0) = 1 \ (L \ge 0)$, $N(0, K) = 0 \ (K \ge 1)$, $N(L, K) = N(L-1, K) + N(L-1, K-1) + N(L, K-1) \ (L \ge 1, K \ge 1)$
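The recursion for N(L, K) may be evaluated as in the following sketch, where the function name `pyramid_count` is hypothetical. The boundary conditions are read with "greater than or equal to", since otherwise N(1, K) would be undefined; this reading is an assumption about the garbled inequality signs.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def pyramid_count(L, K):
    """Number of integer points y in Z^L with sum(|y_i|) == K."""
    if K == 0:
        return 1  # only the zero vector
    if L == 0:
        return 0  # no coordinates left but energy remains
    return (pyramid_count(L - 1, K)
            + pyramid_count(L - 1, K - 1)
            + pyramid_count(L, K - 1))
```

As a check, the surface of radius 2 in three dimensions contains 6 points of the form (±2, 0, 0) and 12 points of the form (±1, ±1, 0), and the recursion indeed returns 18.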
[00234] An integer grid point $Y = (y_1, y_2, \ldots, y_L) \in Z^L$ on the pyramid surface with an energy radius of K is identified by a number b in [0, 1, ..., N(L, K) - 1], and b is referred to as the label of the grid point. The steps for computing the label b are as follows.
[00235] In step 1.1, set b = 0, i = 1, k = K, l = L, and calculate N(m, n) (m <= L, n <= K) according to the recursion formula above. Define:

[00236] In step 1.2, if yi = 0, then b = b + 0;
[00237] if | yi | = 1, then

[00238] if | yi |> 1, then

[00239] In step 1.3, k = k - |y_i|, l = l - 1, i = i + 1; if k = 0 at that point, the search stops and b is the label of Y; otherwise, continue with step 1.2.
[00240] In step 2, the grid points on all pyramid surfaces are labeled together.
[00241] The labels for each grid point on all pyramid surfaces are calculated according to the number of grid points on various pyramid surfaces and the label for each grid point on the respective pyramid surface:

[00242] where kk is an even number. At that point, index_b(j, m) is the index of the D8 grid point $\hat{Y}^m_j$ in the codebook, that is, the index of the m-th 8-dimensional vector in coding subband j.
[00243] e, steps a to d are repeated until index generation is complete for every 8-dimensional vector of all coding subbands whose number of coded bits is greater than 0.
[00244] f, the vector quantization index index_b(j, k) of each 8-dimensional vector in each coding subband is obtained according to the pyramid lattice vector quantization method, where k denotes the k-th 8-dimensional vector of coding subband j, and Huffman coding is performed on the quantization index index_b(j, k) under the following conditions.
[00245] 1) In all coding subbands in which the number of bits allocated to a single frequency domain coefficient is greater than 1 and less than 5, except for 2, every 4 bits of the natural binary code of each vector quantization index form one group, and Huffman coding is performed on each group.
[00246] 2) In all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 2, the pyramid lattice vector quantization index of each 8-dimensional vector is encoded using 15 bits. Within the 15 bits, Huffman coding is performed on 3 groups of 4 bits and 1 group of 3 bits respectively. Therefore, in all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 2, 1 bit is saved in coding each 8-dimensional vector.
[00247] 3) When the number of bits allocated to a single frequency domain coefficient of the coding subband is 1: if the quantization index is less than 127, 7 bits are used to encode the quantization index, the 7 bits are divided into 1 group of 3 bits and 1 group of 4 bits, and Huffman coding is performed on the two groups respectively; if the quantization index is 127, its natural binary code is taken as "1111 1110", the first seven "1"s are divided into 1 group of 3 bits and 1 group of 4 bits, and Huffman coding is performed on the two groups respectively; and if the quantization index is 128, its natural binary code is taken as "1111 1111", the first seven "1"s are divided into 1 group of 3 bits and 1 group of 4 bits, and Huffman coding is performed on the two groups respectively.
[00248] The method of performing Huffman coding on the quantization index can be described by the following pseudocode:
[00249] in all coding subbands with region_bit(j) = 1.5 and 2 < region_bit(j) < 5 {
[00250] n takes values in the range [0, region_bit(j) × 8/4 - 1], increasing in steps of 1, and the following loop is executed: {
[00251] index_b(j, k) is shifted right by 4 * n bits;
[00252] calculate the low 4 bits tmp of index_b(j, k), that is, tmp = and(index_b(j, k), 15)
[00253] calculate the code word of tmp in the codebook and the number of bits it consumes;
[00254] plvq_codebook(j, k) = plvq_code(tmp + 1);
[00255] plvq_count(j, k) = plvq_bit_count(tmp + 1);
[00256] where plvq_codebook(j, k) and plvq_count(j, k) are the code word and the number of bits consumed in the Huffman codebook for the k-th 8-dimensional vector of subband j; plvq_bit_count and plvq_code are looked up in Table 6.
[00257] The total number of bits consumed after using Huffman encoding is updated:
[00258] bit_used_huff_all = bit_used_huff_all + plvq_bit_count (tmp + 1); }}
[00259] in the coding subbands with region_bit(j) = 2 {
[00260] n takes values in the range [0, region_bit(j) × 8/4 - 2], increasing in steps of 1, and the following loop is executed: {
[00261] index_b(j, k) is shifted right by 4 * n bits;
[00262] calculate the low 4 bits tmp of index_b(j, k), that is, tmp = and(index_b(j, k), 15)
[00263] calculate the code word of tmp in the codebook and the number of bits it consumes;
[00264] plvq_count(j, k) = plvq_bit_count(tmp + 1);
[00265] plvq_codebook(j, k) = plvq_code(tmp + 1);
[00266] where plvq_count(j, k) and plvq_codebook(j, k) are the number of Huffman bits consumed and the code word of the k-th 8-dimensional vector of subband j respectively; plvq_bit_count and plvq_code are looked up in Table 6.
[00267] The total number of bits consumed after using Huffman encoding is updated:
[00268] bit_used_huff_all = bit_used_huff_all + plvq_bit_count (tmp + 1); } {
[00269] The remaining 3-bit group is then processed:
[00270] index_b(j, k) is shifted right by [region_bit(j) × 8/4 - 2] × 4 bits;
[00271] calculate the low 3 bits tmp of index_b(j, k), that is, tmp = and(index_b(j, k), 7)
[00272] calculate the code word of tmp in the codebook and the number of bits it consumes;
[00273] plvq_count(j, k) = plvq_bit_count_r2_3(tmp + 1);
[00274] plvq_codebook(j, k) = plvq_code_r2_3(tmp + 1);
[00275] where plvq_count(j, k) and plvq_codebook(j, k) are the number of Huffman bits consumed and the code word of the k-th 8-dimensional vector of subband j respectively; plvq_bit_count_r2_3 and plvq_code_r2_3 are looked up in Table 7.
[00276] The total number of bits consumed after using Huffman encoding is updated:
[00277] bit_used_huff_all = bit_used_huff_all + plvq_bit_count (tmp + 1); }}
[00278] in the coding subbands with region_bit(j) = 1 {
[00279] if index_b(j, k) < 127 {{
[00280] calculate the low 4 bits tmp of index_b(j, k), that is, tmp = and(index_b(j, k), 15)
[00281] calculate the code word of tmp in the codebook and the number of bits it consumes;
[00282] plvq_count(j, k) = plvq_bit_count_r1_4(tmp + 1);
[00283] plvq_codebook(j, k) = plvq_code_r1_4(tmp + 1);
[00284] where plvq_count(j, k) and plvq_codebook(j, k) are the number of Huffman bits consumed and the code word of the k-th 8-dimensional vector of subband j respectively; plvq_bit_count_r1_4 and plvq_code_r1_4 are looked up in Table 8.
[00285] The total number of bits consumed after using Huffman encoding is updated:
[00286] bit_used_huff_all = bit_used_huff_all + plvq_bit_count (tmp + 1); } {
[00287] The remaining 3-bit group is then processed:
[00288] index_b(j, k) is shifted right by 4 bits;
[00289] calculate the low 3 bits tmp of index_b(j, k), that is, tmp = and(index_b(j, k), 7)
[00290] calculate the code word of tmp in the codebook and the number of bits it consumes:
[00291] plvq_count(j, k) = plvq_bit_count_r1_3(tmp + 1);
[00292] plvq_codebook(j, k) = plvq_code_r1_3(tmp + 1);
[00293] where plvq_count(j, k) and plvq_codebook(j, k) are the number of Huffman bits consumed and the code word of the k-th 8-dimensional vector of subband j respectively; plvq_bit_count_r1_3 and plvq_code_r1_3 are looked up in Table 9.
[00294] The total number of bits consumed after using Huffman encoding is updated:
[00295] bit_used_huff_all = bit_used_huff_all + plvq_bit_count (tmp + 1); }}
[00296] if index_b(j, k) = 127
[00297] { its binary value is taken as "1111 1110";
[00298] the Huffman code tables in Table 9 and Table 8 are searched for the first three "1"s and the last four "1"s respectively; the calculation method is the same as in the preceding case index_b(j, k) < 127.
[00299] The total number of bits consumed after using Huffman coding is updated: a total of 8 bits is required. }
[00300] if index_b(j, k) = 128
[00301] { its binary value is taken as "1111 1111";
[00302] the Huffman code tables in Table 7 and Table 6 are searched for the first three "1"s and the last four "1"s respectively; the calculation method is the same as in the preceding case index_b(j, k) < 127.
[00303] The total number of bits consumed after using Huffman coding is updated: a total of 8 bits is required. }}
[00304] Therefore, in all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 1, 1 bit is saved in coding each 8-dimensional vector when index_b(j, k) < 127.
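The grouping of a quantization index into 4-bit and 3-bit groups before the Huffman table lookup may be sketched as follows. The function name `split_index_groups` is hypothetical, and the Huffman tables themselves (Tables 6 to 9) are not reproduced. The sketch follows conditions 1) to 3) of step f: 8 × region_bit bits in general, 15 bits when region_bit is 2, and 7 bits when region_bit is 1 and the index is below 127. The escape codes for indices 127 and 128 are deliberately not handled.

```python
def split_index_groups(index_b, region_bit):
    """Split a PLVQ index into (value, width) groups, lowest bits first."""
    if region_bit == 2:
        total_bits = 15  # one bit saved relative to 16
    elif region_bit == 1:
        if index_b >= 127:
            raise ValueError("escape codes for 127 and 128 are not handled here")
        total_bits = 7  # one bit saved relative to 8
    else:
        total_bits = int(region_bit * 8)
    groups = []
    remaining = total_bits
    while remaining >= 4:
        groups.append((index_b & 15, 4))  # tmp = and(index_b, 15)
        index_b >>= 4                     # shift right by 4 bits
        remaining -= 4
    if remaining:
        groups.append((index_b & ((1 << remaining) - 1), remaining))
    return groups
```

Each returned value would then be used, offset by one as in the pseudocode above, to look up a code word and its bit count in the table that matches the group width.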
[00305] Table 6 Huffman code table for pyramid lattice vector quantization

[00306] Table 7 Huffman code table for pyramid lattice vector quantization

[00307] Table 8 Huffman code table for pyramid lattice vector quantization

[00308] Table 9 Huffman code table for pyramid lattice vector quantization

[00309] g: it is evaluated whether Huffman coding saves bits.
[00310] The set of all low-bit coding subbands is denoted C. The bits saved in all coding subbands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, as described in 2) and 3) of step f above, are calculated and recorded as the number of absolutely saved bits bit_saved_r1_r2_all_core, and the total number of bits bit_used_huff_all consumed after Huffman coding is performed on the vector quantization indices of the 8-dimensional vectors belonging to all coding subbands in C is calculated. bit_used_huff_all is compared with the total number of bits bit_used_nohuff_all consumed by natural coding: if bit_used_huff_all < bit_used_nohuff_all, the Huffman-coded vector quantization indices are transmitted and the Huffman coding indicator Flag_huff_PLVQ_core is set to 1; otherwise, natural coding is performed directly on the vector quantization indices and Flag_huff_PLVQ_core is set to 0.
[00311] bit_used_nohuff_all above equals the total number of bits allocated to all coding subbands in C, sum(bit_band_used(j), j ∈ C), minus bit_saved_r1_r2_all.
[00312] h: the bit allocation number is corrected.
[00313] If the Huffman encoding indicator Flag_huff_PLVQ_core is 0, the bit allocation of the coding sub-bands is corrected using the number of bits remaining from the initial allocation, remain_bits_core, and the number of absolutely saved bits, bit_saved_r1_r2_all_core. If the Huffman encoding indicator Flag_huff_PLVQ_core is 1, the bit allocation of the coding sub-bands is corrected using the number of bits remaining from the initial allocation, remain_bits_core, the number of absolutely saved bits, bit_saved_r1_r2_all_core, and the bits saved by Huffman encoding.
[00314] The spherical lattice vector quantization and coding method is illustrated below.
[00315] The high-bit coding sub-bands are quantified using the spherical lattice vector quantization method; in this case, the number of bits allocated to sub-band j satisfies 5 <= region_bit(j) <= 9.
[00316] Here, 8-dimension lattice vector quantization based on the D8 lattice (referred to below as the D8 grid) is also used.
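As background for the steps that follow, the sketch below shows one common way of finding the nearest D8 grid point. It assumes the standard definition of D8 (8-dimension integer vectors whose components sum to an even number) and the usual round-then-fix-parity rule; the source does not show its own mapping, so the function name nearest_d8 and this rule are assumptions, and any scaling used by the patent is not modelled.

```python
import math


def nearest_d8(x):
    """Nearest point of the D8 grid (integer vectors with an even sum) to x."""
    # Round every component to the nearest integer (ties go upwards).
    r = [math.floor(v + 0.5) for v in x]
    if sum(r) % 2 == 0:
        return r
    # Odd sum: re-round the component with the largest rounding error
    # in the opposite direction, which restores even parity at least cost.
    errs = [v - ri for v, ri in zip(x, r)]
    i = max(range(len(x)), key=lambda t: abs(errs[t]))
    r[i] += 1 if errs[i] > 0 else -1
    return r
```

For example, a vector with a single component of 0.6 rounds to a point with odd sum, so that component is re-rounded to 0 and the result is the zero vector.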
[00317] a, the energy of the m-th normalized vector Ym to be quantified in the coding subband is scaled according to the number of bits region_bit(j) allocated to a single frequency domain coefficient in coding subband j, as follows:

[00318] where a = (2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6),

[00319] and scale(region_bit(j)) represents the energy scale factor when the number of bits allocated to a single frequency domain coefficient in the coding subband is region_bit(j); the corresponding values can be found in Table 10.
[00320] Table 10 Corresponding relationship between the number of bit allocations of the spherical grid vector quantification and energy scale factor

[00321] b, the index vectors of the D8 grid points are generated.
[00322] The m-th vector Ym to be quantified in the coding subband, after energy scaling, is mapped to a grid point of D8:

[00323] It is evaluated whether f(Ym / 2^region_bit(j)) is a zero vector, i.e., whether all of its components are zero. If it is, the zero vector condition is said to be fulfilled; otherwise, the zero vector condition is said not to be fulfilled.
[00324] If the condition of zero vector is fulfilled, the index vector can be obtained by the following equation of generation of index vectors:

[00325] The index vector k of the D8 grid point is then output, where G is the generator matrix of the D8 grid points, of the following form:

[00326] If the zero vector condition is not fulfilled, the value of the vector Ym is divided by 2 until the zero vector condition on f(Ym / 2^region_bit(j)) is satisfied, and the small multiple value of Ym itself is backed up as w. The reduced vector Ym is then added to the backed-up small multiple value w and quantified to a D8 grid point, and it is evaluated whether the zero vector condition is fulfilled. If the zero vector condition is not met, an index vector k of the D8 grid point that approximately meets the zero vector condition is obtained according to the index vector calculation equation; otherwise, the vector Ym continues to be increased by the backed-up small multiple value w and quantified to a D8 grid point until the zero vector condition is met. Finally, the index vector k of the D8 grid point that approximately meets the zero vector condition is obtained according to the index vector calculation equation and is output. This process can also be described by the following pseudocode.

[00327] c, The vector quantification indices of the coding subbands with high bits are coded and at that time, the number of bits allocated to the subband j complies with 5 <= region_bit (j) <= 9.
[00328] According to the spherical lattice vector quantization method, each 8-dimension vector in the coding sub-bands whose bit allocation number is from 5 to 9 is quantified to obtain the vector index k = {k1, k2, k3, k4, k5, k6, k7, k8}, and natural encoding is performed on each component of the index vector k according to the number of bits allocated to a single frequency domain coefficient, to obtain the encoded bits of the vector.
[00329] As shown in FIG. 3, the bit allocation correction process specifically comprises the following steps.
[00330] In step 301, the number of bits diff_bit_count_core available for the bit allocation correction is calculated. If the Huffman coding indicator Flag_huff_PLVQ_core is 0, then
[00331] diff_bit_count_core = remain_bits_core + bit_saved_r1_r2_all_core;
[00332] if the Huffman coding indicator Flag_huff_PLVQ_core is 1, then
[00333] diff_bit_count_core = remain_bits_core + bit_saved_r1_r2_all_core + (bit_used_nohuff_all - bit_used_huff_all).
[00334] Set count = 0.
[00335] In step 302, if diff_bit_count_core is greater than 0, the maximum value rk(jk) is searched for among all rk(j) (j = 0, ..., L_core-1), which is expressed as:
jk = argmax rk(j), j = 0, ..., L_core-1
[00336] In step 303, it is evaluated whether region_bit(jk) + 1 is less than or equal to 9. If so, the next step is performed; otherwise, the importance of coding subband jk is set to the lowest level (for example, rk(jk) = -100), which indicates that the bit allocation number of that coding subband no longer needs to be corrected, and the process returns to step 302.
[00337] In step 304, it is evaluated whether diff_bit_count_core is greater than or equal to the number of bits that would be consumed by correcting the bit allocation number of coding subband jk (calculated according to natural encoding if Flag_huff_PLVQ_core is 0, and according to Huffman coding if Flag_huff_PLVQ_core is 1). If so, step 305 is performed: the bit allocation number region_bit(jk) of coding subband jk is corrected, the importance value rk(jk) of the subband is reduced, vector quantification and natural coding or Huffman coding are performed again on coding subband jk, and finally the value of diff_bit_count_core is updated; otherwise, the bit allocation correction process ends.
[00338] In step 305, during the bit allocation correction, 1 bit is allocated to a coding subband whose bit allocation number is 0, and its importance is reduced by 1 after the allocation; 0.5 bit is allocated to a coding subband whose bit allocation number is greater than 0 and less than 5, and its importance is reduced by 0.5 after the allocation; and 1 bit is allocated to a coding subband whose bit allocation number is greater than 5, and its importance is reduced by 1 after the allocation.
[00339] In step 306, count is set to count + 1, and it is evaluated whether count is less than or equal to Maxcount. If so, the process returns to step 302; otherwise, the bit allocation correction process ends.
[00340] Maxcount above is an upper limit on the number of loop iterations, which is determined according to the encoded bit rate and the sample rate. In the present embodiment, Maxcount = 7 is used if the Huffman coding indicator Flag_huff_PLVQ is 0, and Maxcount = 31 is used if Flag_huff_PLVQ is 1.
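Steps 301 to 306 can be sketched as a loop. The following Python sketch is illustrative only and rests on several assumptions that are not in the source: the function and parameter names are hypothetical, the cost of a correction is modelled as the natural-coding cost (step size times the number of coefficients in the subband, given by the hypothetical band_width list), the re-quantization of the subband is omitted, a subband with exactly 5 bits is treated like those above 5, and a guard is added so that the loop stops once every subband has been marked with the lowest importance.

```python
def correct_bit_allocation(region_bit, rk, remain_bits, bit_saved, huff_gain,
                           flag_huff, band_width, max_bits=9):
    """Sketch of steps 301-306; updates region_bit and rk in place.

    huff_gain stands for bit_used_nohuff_all - bit_used_huff_all.
    Returns the number of corrections performed (count).
    """
    # Step 301: bits available for the correction.
    diff_bit_count = remain_bits + bit_saved + (huff_gain if flag_huff else 0)
    max_count = 31 if flag_huff else 7
    count = 0
    while diff_bit_count > 0 and count <= max_count:
        # Step 302: pick the most important subband.
        jk = max(range(len(rk)), key=lambda j: rk[j])
        if rk[jk] <= -100:
            break  # added guard: no subband is left to correct
        # Step 303: a subband already at the maximum is taken out of the search.
        if region_bit[jk] + 1 > max_bits:
            rk[jk] = -100
            continue
        # Step 305 step sizes: 0.5 bit for 0 < region_bit < 5, otherwise 1 bit.
        step = 0.5 if 0 < region_bit[jk] < 5 else 1
        cost = step * band_width[jk]  # assumed natural-coding cost
        # Step 304: stop when the remaining bits cannot pay for the correction.
        if diff_bit_count < cost:
            break
        region_bit[jk] += step
        rk[jk] -= step
        diff_bit_count -= cost
        # Step 306: count the iteration.
        count += 1
    return count
```

With Huffman coding enabled, the real cost of a correction depends on re-encoding the subband, which this sketch does not attempt to reproduce.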
[00341] In step 108, inverse quantization is performed on the vector-quantized frequency domain coefficients of the core layer described above, and a differential calculation is performed between the inversely quantized frequency domain coefficients and the original frequency domain coefficients obtained by the time frequency transformation, to obtain the residual signals of the core layer; the encoding signals of the extended layer are constituted by the residual signals of the core layer and the frequency domain coefficients of the extended layer.
[00342] It can be understood that the step of constituting the encoded signals of the extended layer (step 108) can also be performed after the completion of the bit allocations of the encoded signals of the extended layer (step 110).
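The construction of the extended layer encoding signals in step 108 can be illustrated with a short Python sketch. The function name build_extended_coding_signal and the flat-list representation of the coefficients are assumptions made for illustration; n_core denotes the number of frequency domain coefficients that belong to the core layer.

```python
def build_extended_coding_signal(original, core_dequantized, n_core):
    """Residual of the core layer followed by the extended layer coefficients.

    original: all frequency domain coefficients of the frame, low to high.
    core_dequantized: inversely quantized core layer coefficients (n_core values).
    """
    # Residual = original core coefficients minus their dequantized values.
    residual = [x - q for x, q in zip(original[:n_core], core_dequantized)]
    # The coefficients above the core layer are passed through unchanged.
    return residual + list(original[n_core:])
```

A coefficient that the core layer reproduced exactly contributes a zero residual, so the extended layer spends bits only where the core layer left an error.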
[00343] In step 109, the residual signals of the core layer are divided into sub-bands in the same way as the frequency domain coefficients of the core layer, and the quantification indices of amplitude envelopes of the coding sub-bands of the residual signals of the core layer are calculated according to the quantification indices of amplitude envelopes of the coding sub-bands of the core layer and the bit allocation numbers of the core layer (i.e., region_bit(j), j = 0, ..., L_core-1).
[00344] The present step can be implemented through the following substeps.
[00345] In step 109a, a statistical table of correction values of the quantification indices of amplitude envelopes of the residual signals of the core layer is looked up according to the number of bits region_bit(j), j = 0, ..., L_core-1, allocated to a single frequency domain coefficient in the coding sub-bands of the core layer, to obtain the correction values diff(region_bit(j)), j = 0, ..., L_core-1, of the quantification indices of amplitude envelopes of the residual signals of the core layer;
[00346] where region_bit(j) = 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, j = 0, ..., L_core-1, and the correction values of the amplitude envelope quantification indices can be defined according to the following rules: diff(region_bit(j)) >= 0; and, when region_bit(j) > 0, diff(region_bit(j)) does not decrease as the value of region_bit(j) increases.
[00347] In order to obtain a better encoding and decoding effect, statistics can be gathered on the amplitude envelope quantification indices of the coding sub-bands calculated under the various bit allocation numbers (region_bit(j)) and on the amplitude envelope quantification indices of the sub-bands calculated directly from the residual signals, to obtain the statistical table of the most probable correction values of the amplitude envelope quantification indices, as shown in Table 11:
[00348] Table 11 Statistical table of correction values for quantification indices of amplitude envelopes

[00349] In step 109b, the amplitude envelope quantification index of the j-th sub-band of the residual signal of the core layer is calculated according to the amplitude envelope quantification index of the coding sub-band in the core layer and the correction value of the quantification index in Table 11:
[00350] Thq'(j) = Thq(j) - diff(region_bit(j)), j = 0, ..., L_core-1,
[00351] where Thq(j) is the quantification index of the amplitude envelope of coding subband j in the core layer.
[00352] It should be understood that when a given bit allocation number of a given coding subband in the core layer is 0, there is no need to correct the amplitude envelopes of the residual signal coding subband of the core layer, and at that time, the amplitude envelopes value of the residual signal of the core layer is the same as the amplitude envelopes value of the coding subband of the core layer.
[00353] In addition, when the bit allocation number of a given coding subband j in the core layer is region_bit(j) = 9, the quantified amplitude envelope value of the j-th coding subband of the residual signal of the core layer is set to zero.
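The calculation Thq'(j) = Thq(j) - diff(region_bit(j)), together with the two special cases for 0 and 9 bits, can be sketched as below. The function name is hypothetical, and because the contents of the correction table are not reproduced here, the table is passed in as a parameter. Representing "amplitude envelope value set to zero" by None is also an assumption of this sketch.

```python
def residual_envelope_indices(thq, region_bit, diff_table, max_bits=9):
    """Amplitude envelope quantification indices of the core layer residual.

    thq: envelope quantification indices Thq(j) of the core layer subbands.
    region_bit: bit allocation numbers of the same subbands.
    diff_table: mapping from region_bit value to correction value diff(...).
    """
    out = []
    for th, rb in zip(thq, region_bit):
        if rb == 0:
            out.append(th)    # nothing was coded, so the envelope is unchanged
        elif rb == max_bits:
            out.append(None)  # envelope value is set to zero, no index is kept
        else:
            out.append(th - diff_table[rb])
    return out
```

Because the decoder can repeat the same lookup from the core layer indices and bit allocation numbers, no extra bits need to be transmitted for these indices.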
[00354] In step 110, bit allocation is performed in the encoding subbands of the encoding signals of the extended layer in the extended layer.
[00355] The subband division of the extended layer is determined by Table 1 or Table 2. The coding signals in subbands 0, ..., L_core-1 are the residual signals of the core layer, and the coding signals in subbands L_core, ..., L-1 are the frequency domain coefficients in the coding sub-bands of the extended layer. Subbands 0 to L-1 are also referred to as the coding subbands of the extended layer coding signals.
[00356] According to the amplitude envelope quantification indexes calculated from the residual signals of the core layer, the amplitude envelope quantification indices of the coding subbands of the extended layer and the number of bits available for the extended layer , initial values of importance of the coding subbands of the coding signals of the extended layer are calculated within the full frequency range of the extended layer using the bit allocation solution that is the same as that of the core layer, and bit allocation is performed in the coding subbands of the encoded signals of the extended layer.
[00357] In the present embodiment, the frequency range of the extended layer is 0 ~ 13.6 kHz. The total bit rate of the audio stream is 64 kbps, the bit rate of the core layer is 32 kbps, and the maximum bit rate of the extended layer is 64 kbps. The total number of bits available in the extended layer is calculated from the bit rate of the core layer and the maximum bit rate of the extended layer, and bit allocation is then performed until the bits have been completely consumed.
[00358] In step 111, normalization, vector quantization and encoding are performed on the encoding signals of the extended layer, according to the quantification indices of amplitude of the encoding sub-bands of encoding signals of the extended layer and corresponding bit allocation numbers to obtain encoded bits of the encoding signals. Where, the vector constitution, the vector quantification method and the encoding method of the encoding signals in the extended layer are the same as those of the frequency domain coefficients in the core layer, respectively.
[00359] In step 112, the hierarchical stream of encoded bits is constituted and the bit rate layers are constituted according to the value of the bit rate.
[00360] As shown in FIG. 4, the hierarchical stream of encoded bits is constituted as follows: first, the secondary information of the core layer is written into the bit stream multiplexer MUX in the following order: Flag_transient, Flag_huff_rms_core, Flag_huff_PLVQ_core and count_core; then the encoded bits of the amplitude envelopes of the coding sub-bands of the core layer are written into the MUX, followed by the encoded bits of the frequency domain coefficients of the core layer; next, the secondary information of the extended layer is written into the MUX in the following order: the Huffman encoding indicator bit Flag_huff_rms_ext of the amplitude envelopes of the coding sub-bands of the extended layer, the Huffman encoding indicator bit Flag_huff_PLVQ_ext of the frequency domain coefficients, and the number of iterations count_ext of the bit allocation correction; then the encoded bits of the amplitude envelopes of the coding sub-bands of the extended layer (L_core, ..., L-1) are written into the MUX, followed by the encoded bits of the encoding signals of the extended layer; and finally the hierarchical bit stream, written in the order mentioned above, is transmitted to a decoding terminal;
[00361] wherein the writing order of the encoded bits of the encoding signals of the extended layer is arranged according to the initial importance values of the coding sub-bands of the encoding signals of the extended layer. That is, the encoded bits of coding sub-bands with a larger initial importance value are written into the bit stream first, and among coding sub-bands of the same importance, the lower-frequency coding sub-band is written first.
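The writing order of the extended layer subbands can be expressed as a sort with two keys. In this Python sketch the function name write_order is hypothetical, and the subband index is assumed to increase with frequency.

```python
def write_order(initial_importance):
    """Subband indices in the order their encoded bits are written."""
    # Larger initial importance first; ties are broken by lower subband index,
    # i.e. by lower frequency.
    return sorted(range(len(initial_importance)),
                  key=lambda j: (-initial_importance[j], j))
```

Since the least important subbands end up at the tail of the bit stream, dropping bits from the end removes them first.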
[00362] The amplitude envelopes of the residual signals in the extended layer are calculated from the amplitude envelopes of the coding subbands of the core layer and the bit allocation numbers, so there is no need to transmit them to the decoding terminal. Thus, not only can the encoding accuracy within the bandwidth of the core layer be increased, but there is also no need to add bits for the transmission of the residual signal amplitude envelope values.
[00363] After the unnecessary bits at the end of the bit stream multiplexer are discarded according to the bit rate to be transmitted, the number of bits that meets the bit rate requirement is transmitted to the decoding terminal. That is, unnecessary bits are discarded in order of the importance of the coding sub-bands, from the least important to the most important.
[00364] In the present embodiment, the encoding frequency range is 0 ~ 13.6 kHz, the maximum bit rate is 64 kbps, and the layering according to the bit rate is as follows:
[00365] the frequency domain coefficients within the encoding frequency range of 0 ~ 7 kHz are assigned to the core layer, the maximum bit rate corresponding to the core layer is 32 kbps, and the core layer is denoted layer L0; the encoding frequency range of the extended layer is 0 ~ 13.6 kHz, its maximum bit rate is 64 kbps, and the extended layer is denoted layer L1_5; and
[00366] before transmission to the decoding terminal, according to the number of bits that are discarded, the bit rates can be divided into an L1_1 layer corresponding to 36 kbps, an L1_2 layer corresponding to 40 kbps, an L1_3 layer corresponding to 48 kbps, an L1_4 layer corresponding to 56 kbps and an L1_5 layer corresponding to 64 kbps.
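The bit rate layers listed above amount to cutting the frame's bit stream at different lengths. The Python sketch below illustrates this; the 20 ms frame duration is an assumption made only to turn bit rates into bit counts (the frame length is not stated in this passage), and the names LAYER_KBPS and truncate_frame are hypothetical.

```python
# Bit rate of each layer in kbps, as listed in the text.
LAYER_KBPS = {"L0": 32, "L1_1": 36, "L1_2": 40, "L1_3": 48, "L1_4": 56, "L1_5": 64}


def truncate_frame(bits, layer, frame_ms=20):
    """Keep only the leading bits of one frame that fit the chosen layer.

    frame_ms is an assumed frame duration; kbps times ms gives bits.
    """
    n_bits = LAYER_KBPS[layer] * frame_ms
    # Bits are dropped from the end, where the least important subbands sit.
    return bits[:n_bits]
```

Under the assumed 20 ms frame, a full 64 kbps frame holds 1280 bits, and the L1_2 layer keeps the first 800 of them.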
[00367] FIG. 5 illustrates a relationship between a hierarchy according to a range of frequencies and a hierarchy according to a bit rate.
[00368] FIG. 6 is a structural diagram of a hierarchical audio coding system in accordance with the present invention. As shown in FIG. 6, the system comprises: a transient detection unit, a frequency domain coefficient generation unit, an amplitude envelope calculation unit, an amplitude envelope encoding and quantification unit, a core layer bit allocation unit, a core layer frequency domain coefficient vector encoding and quantification unit, an extended layer encoding signal generation unit, a residual signal amplitude envelope generation unit, an extended layer bit allocation unit, an extended layer encoding signal vector encoding and quantification unit, and a bit stream multiplexer; wherein,
[00369] the transient detection unit is configured to perform a transient detection on an audio signal of a current frame;
[00370] the frequency domain coefficient generation unit is connected to the transient detection unit and is configured as follows: when the transient detection indicates a steady state signal, a time frequency transformation is performed directly on the audio signal to obtain the total frequency domain coefficients; when the transient detection indicates a transient signal, the audio signal is divided into M subframes, the time frequency transformation is performed on each subframe, the total frequency domain coefficients of the current frame are constituted by the M groups of frequency domain coefficients obtained by the transformation, and the total frequency domain coefficients are reorganized so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies; wherein the total frequency domain coefficients comprise frequency domain coefficients of the core layer and frequency domain coefficients of the extended layer, the coding sub-bands comprise coding sub-bands of the core layer and coding sub-bands of the extended layer, the frequency domain coefficients of the core layer constitute several coding sub-bands of the core layer, and the frequency domain coefficients of the extended layer constitute several coding sub-bands of the extended layer;
[00371] the amplitude envelope calculation unit is connected to the frequency domain coefficient generation unit and is configured to calculate amplitude envelope values of the core layer coding sub-bands and coding sub-bands of the enlarged layer;
[00372] the amplitude envelope encoding and quantification unit is connected to the amplitude envelope calculation unit and the transient detection unit and is configured to quantify and encode the amplitude envelope values of the coding sub-bands of the core layer and the coding sub-bands of the extended layer, to obtain the quantification indices of amplitude envelopes and the encoded bits of amplitude envelopes of the coding sub-bands of the core layer and the coding sub-bands of the extended layer; wherein, if the signal is a steady-state signal, the amplitude envelope values of the coding sub-bands of the core layer and of the extended layer are quantified together, and if the signal is a transient signal, the amplitude envelope values of the coding sub-bands of the core layer and of the extended layer are quantified separately, and the quantification indices of amplitude envelopes of the coding sub-bands of the core layer and the quantification indices of amplitude envelopes of the coding sub-bands of the extended layer are each reorganized;
[00373] the core layer bit allocation unit is connected to the amplitude envelope encoding and quantification unit and is configured to perform bit allocation on the coding sub-bands of the core layer according to the quantification indices of amplitude envelopes of the coding sub-bands of the core layer, to obtain the bit allocation numbers of the coding sub-bands of the core layer;
[00374] the core layer frequency domain coefficient vector encoding and quantification unit is connected to the frequency domain coefficient generation unit, the amplitude envelope encoding and quantification unit and the core layer bit allocation unit, and is configured to perform normalization, vector quantization and coding on the frequency domain coefficients of the coding sub-bands of the core layer, using the bit allocation numbers and the quantified amplitude envelope values of the coding sub-bands of the core layer reconstructed from the quantification indices of amplitude envelopes of the coding sub-bands of the core layer, to obtain the encoded bits of the frequency domain coefficients of the core layer;
[00375] the extended layer encoding signal generation unit is connected to the frequency domain coefficient generation unit and to the core layer frequency domain coefficient vector encoding and quantification unit, and is configured to generating residual signals, to obtain encoding signals of the extended layer comprised by the residual signals and frequency domain coefficients of the extended layer;
[00376] the residual signal amplitude envelope generation unit is connected to the amplitude envelope encoding and quantification unit and the core layer bit allocation unit, and is configured to obtain amplitude envelope quantification indices the residual signals from the core layer, according to the quantification indices of amplitude envelopes of the coding subbands of the core layer and bit allocation numbers of the corresponding coding subbands;
[00377] the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope encoding and quantification unit, and is configured to perform bit allocation on the coding sub-bands of the extended layer according to the quantification indices of amplitude envelopes of the residual signals of the core layer and the quantification indices of amplitude envelopes of the coding sub-bands of the extended layer, to obtain the bit allocation numbers of the coding sub-bands of the extended layer;
[00378] the extended layer encoding signal vector encoding and quantification unit is connected to the amplitude envelope encoding and quantification unit, the extended layer bit allocation unit, the residual signal amplitude envelope generation unit and the extended layer encoding signal generation unit, and is configured to perform normalization, vector quantization and encoding on the encoding signals of the extended layer, using the bit allocation numbers and the quantified amplitude envelope values of the coding sub-bands of the encoding signals of the extended layer reconstructed from the quantification indices of amplitude envelopes of those coding sub-bands, to obtain the encoded bits of the encoding signals of the extended layer;
[00379] the bit stream multiplexer is connected to the amplitude envelope encoding and quantification unit, the core layer frequency domain coefficient vector encoding and quantification unit and the extended layer encoding signal vector encoding and quantification unit, and is configured to package the secondary information bits of the core layer, the encoded bits of amplitude envelopes of the coding sub-bands of the core layer, the encoded bits of the frequency domain coefficients of the core layer, the secondary information bits of the extended layer, the encoded bits of amplitude envelopes of the coding sub-bands of the extended layer and the encoded bits of the encoding signals of the extended layer.
[00380] The frequency domain coefficient generation unit is configured, when obtaining the total frequency domain coefficients of the current frame, as follows: a 2N-point time domain sampling signal is composed of the N-point time domain sampling signal x(n) of the current frame and the N-point time domain sampling signal xold(n) of the previous frame, and windowing and time domain aliasing processing are performed on the 2N-point signal to obtain an N-point time domain sampling signal; reversal processing is performed on this time domain signal, a sequence of zeros is then added at each end of the signal, the lengthened signal is divided into M mutually overlapping subframes, and windowing, time domain aliasing processing and the time frequency transformation are performed on the time domain signal of each subframe to obtain M groups of frequency domain coefficients, which constitute the total frequency domain coefficients of the current frame.
[00381] The frequency domain coefficient generation unit is further configured, when reorganizing the frequency domain coefficients, to reorganize them separately within the core layer and within the extended layer, so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies.
[00382] When reorganizing the amplitude envelope quantification indices, the amplitude envelope encoding and quantification unit is specifically configured to: arrange the amplitude envelope quantification indices of the coding sub-bands within the same subframe together, in ascending or descending order of frequency, and, at subframe boundaries, connect the subframes through two coding sub-bands that represent corresponding (peer) frequencies and belong to the two adjacent subframes.
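One possible reading of the reorganization described above is a serpentine order: the indices of one subframe are listed in ascending frequency, those of the next in descending frequency, and so on, so that the two entries meeting at each subframe boundary belong to the same frequency. The Python sketch below illustrates that reading only; the function name serpentine_order and the (subframe, subband) tuple representation are hypothetical, and the source does not spell out the exact order.

```python
def serpentine_order(num_subframes, bands_per_subframe):
    """(subframe, subband) pairs in alternating ascending/descending order."""
    order = []
    for m in range(num_subframes):
        bands = range(bands_per_subframe)
        if m % 2 == 1:
            # Odd subframes run from high to low frequency so that the
            # boundary entries of neighbouring subframes share a frequency.
            bands = reversed(bands)
        order.extend((m, b) for b in bands)
    return order
```

Neighbouring entries in this order tend to have similar envelope values, which would keep the differences between successive indices small.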
[00383] The bit stream multiplexer multiplexes and packages according to the following bit stream format:
[00384] first, following the beginning of the bit stream frame, the secondary information bits of the core layer are written, then the encoded bits of amplitude envelopes of the coding sub-bands of the core layer are written into the bit stream multiplexer (MUX), and then the encoded bits of the frequency domain coefficients of the core layer are written into the MUX;
[00385] then, the secondary information bits of the extended layer are written into the MUX, followed by the encoded bits of amplitude envelopes of the coding sub-bands of the frequency domain coefficients of the extended layer, and then the encoded bits of the encoding signals of the extended layer; and
[00386] the number of bits that comply with the bit rate requirements is transmitted to the decoding terminal, according to the required bit rate.
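The bit stream format above can be summarised as a fixed field order. In the Python sketch below, the flag and counter names are those used in this description, while the names given to the four payload fields, the function pack_frame and the representation of each field as a list of bits are assumptions made for illustration.

```python
# Core layer fields, in writing order.
CORE_FIELDS = ["Flag_transient", "Flag_huff_rms_core", "Flag_huff_PLVQ_core",
               "count_core", "envelope_bits_core", "coefficient_bits_core"]
# Extended layer fields, in writing order.
EXT_FIELDS = ["Flag_huff_rms_ext", "Flag_huff_PLVQ_ext", "count_ext",
              "envelope_bits_ext", "coding_signal_bits_ext"]


def pack_frame(fields):
    """Concatenate the bit lists of all fields in the prescribed order."""
    stream = []
    for name in CORE_FIELDS + EXT_FIELDS:
        stream.extend(fields[name])
    return stream
```

Placing all core layer fields first means that a stream cut anywhere after the core layer still contains everything needed to decode the core layer.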
[00387] The secondary information of the core layer comprises a transient detection indicator bit, a Huffman coding indicator bit for the amplitude envelopes of the coding sub-bands of the core layer, a Huffman coding indicator bit for the frequency domain coefficients of the core layer, and bits for the number of iterations of the core layer bit allocation correction.
[00388] The secondary information of the extended layer comprises a Huffman coding indicator bit for the amplitude envelopes of the coding sub-bands of the extended layer, a Huffman coding indicator bit for the encoding signals of the extended layer, and bits for the number of iterations of the extended layer bit allocation correction.
[00389] The extended layer encoding signal generation unit further comprises a residual signal generation module and an extended layer encoding signal combination module;
[00390] the residual signal generation module is configured to inversely quantify the quantified values of the frequency domain coefficients of the core layer and to perform a differential calculation with the frequency domain coefficients of the core layer, in order to obtain the residual signals of the core layer; and
[00391] the extended layer encoding signal combination module is configured to combine the residual signals of the core layer and the frequency domain coefficients of the extended layer, in order of frequency bands, to obtain the encoding signals of the extended layer.
[00392] The residual signal amplitude envelope generation unit further comprises a quantification index correction value acquisition module and a residual signal amplitude envelope quantification index calculation module;
[00393] the quantification index correction value acquisition module is configured to look up a statistical table of correction values of the quantification indices of amplitude envelopes of the residual signals of the core layer according to the bit allocation numbers of the coding sub-bands of the core layer, to obtain the correction values of the quantification indices of amplitude envelopes of the coding sub-bands of the residual signals, where the correction value of the quantification index of each coding sub-band is greater than or equal to 0 and does not decrease as the bit allocation number of the corresponding core layer coding sub-band increases; if the bit allocation number of a core layer coding sub-band is 0, the correction value of the quantification index of the residual signal of the core layer in that coding sub-band is 0, and if the bit allocation number of the sub-band is the defined maximum bit allocation number, the amplitude envelope value of the residual signal in that sub-band is 0; and
[00394] the residual signal amplitude envelope quantification index calculation module is configured to perform a differential calculation between the amplitude envelope quantification index of each coding sub-band of the core layer and the correction value of the quantification index of the corresponding coding sub-band, to obtain the amplitude envelope quantification index of the coding sub-band of the residual signal of the core layer.
[00395] The bit stream multiplexer is further configured to write the encoded bits of the encoding signals of the extended layer into the bit stream in descending order of the initial importance values of the coding sub-bands of the encoding signals of the extended layer, and, among coding sub-bands of equal importance, to write the encoded bits of the lower-frequency coding sub-bands first.
[00396] For the specific functions of the units (modules) in FIG. 6, refer to the description of the process illustrated in FIG. 2.
DECODING METHOD AND SYSTEM
[00397] Based on the idea of the present invention, a hierarchical audio decoding method according to the present invention is shown in FIG. 7, and the decoding method comprises the following steps.
[00398] In step 701, the bit stream transmitted by the encoding terminal is demultiplexed, and the encoded bits of amplitude envelopes of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer are decoded, to obtain the quantification indices of amplitude envelopes of the coding sub-bands of the core layer and of the extended layer; if the transient detection information indicates a transient signal, the quantification indices of amplitude envelopes of the coding sub-bands of the core layer and of the extended layer are additionally rearranged, respectively, so that their corresponding frequencies are aligned from low to high.
[00399] In step 702, bit allocation is performed on the coding sub-bands of the core layer according to the quantification indices of amplitude envelopes of the coding sub-bands of the core layer, the quantification indices of amplitude envelopes of the residual signals of the core layer are then calculated, and bit allocation is performed on the coding sub-bands of the encoding signals of the extended layer according to the quantification indices of amplitude envelopes of the residual signals of the core layer and the quantification indices of amplitude envelopes of the coding sub-bands of the extended layer.
[00400] The method of calculating the amplitude envelope quantization indices of the residual signals comprises: looking up a statistical table of correction values for the amplitude envelope quantization indices of the residual signals of the core layer, according to the bit allocation numbers of the coding subbands of the core layer, to obtain the correction values; and performing a differential calculation between the amplitude envelope quantization indices of the coding subbands of the core layer and the correction values of the corresponding coding subbands, to obtain the amplitude envelope quantization indices of the residual signals of the core layer; wherein
[00401] the correction value of the amplitude envelope quantization index of the residual signal of the core layer in each coding subband is greater than or equal to 0, and does not decrease when the bit allocation number of the corresponding core layer coding subband increases; and
[00402] when the bit allocation number of a given core layer coding subband is 0, the correction value of the amplitude envelope quantization index of the residual signal of the core layer is 0, and when the bit allocation number of a given core layer coding subband is the defined maximum bit allocation number, the amplitude envelope value of the corresponding residual signal of the core layer is 0.
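The table lookup and differential calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the table values, the constant MAX_BITS, the half-bit table indexing and the use of None to mark a zero residual envelope are all hypothetical, since the statistical table itself is not reproduced in this section.

```python
# Hypothetical maximum bit allocation number per coefficient (assumption).
MAX_BITS = 8

# Hypothetical correction table indexed by 2 * bit allocation number, so that
# half-bit allocations can be looked up. The values are illustrative only;
# they merely satisfy the stated constraints: non-negative, non-decreasing,
# and 0 when no bits are allocated.
CORRECTION = [i // 2 for i in range(2 * MAX_BITS + 1)]


def residual_envelope_indices(thq_core, bits_core):
    """Return the residual amplitude envelope quantization index per subband.

    thq_core:  amplitude envelope quantization indices of the core subbands
    bits_core: bit allocation numbers of the same subbands
    None marks a subband whose residual envelope value is defined to be 0
    because the subband received the maximum bit allocation.
    """
    result = []
    for thq, bits in zip(thq_core, bits_core):
        if bits >= MAX_BITS:
            result.append(None)
        else:
            correction = CORRECTION[int(round(2 * bits))]
            result.append(thq - correction)  # the differential calculation
    return result
```

With these illustrative values, a subband with index 18 and 2 allocated bits yields a residual index of 16, while a subband with no allocated bits keeps its original index.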
[00403] In step 703, the encoded bits of the frequency domain coefficients of the core layer and the encoded bits of the coding signals of the extended layer are decoded according to the bit allocation numbers of the core layer and of the extended layer, respectively, to obtain the frequency domain coefficients of the core layer and the coding signals of the extended layer; the coding signals of the extended layer are rearranged in subband order and then added to the frequency domain coefficients of the core layer, to obtain the frequency domain coefficients of the total bandwidth.
[00404] In step 704, if the transient detection information indicates a steady-state signal, an inverse time-frequency transform is performed directly on the frequency domain coefficients of the total bandwidth to obtain an audio signal for output; if the transient detection information indicates a transient signal, the frequency domain coefficients of the total bandwidth are rearranged and divided into M groups of frequency domain coefficients, the inverse time-frequency transform is performed on each group, and the final audio signal is calculated from the M groups of time domain signals obtained by the transform.
[00405] The encoded bits of the encoded signals of the extended layer are decoded in the following order.
[00406] In the extended layer, the decoding order of the encoded bits of the coding signals is determined by the initial importance values of the coding subbands of the extended layer coding signals; that is, the coding subbands with the greatest importance are decoded first, and if two coding subbands have the same importance, the lower-frequency coding subband is decoded first; the number of decoded bits is counted during decoding, and decoding stops when the number of decoded bits reaches the required total number of bits.
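The ordering rule above can be illustrated with a short sketch. It assumes that a lower subband index means a lower frequency and that decoding stops as soon as the next subband would exceed the bit budget; the function names and the per-subband bit counts are hypothetical.

```python
def decode_order(importance):
    """Subband indices sorted by descending importance, ties by lower index."""
    return sorted(range(len(importance)), key=lambda j: (-importance[j], j))


def subbands_within_budget(importance, bits_per_subband, total_bits):
    """Walk the decode order and stop once the bit budget is exhausted.

    Returns the list of decoded subband indices and the bits consumed.
    """
    decoded, used = [], 0
    for j in decode_order(importance):
        if used + bits_per_subband[j] > total_bits:
            break  # assumption: a subband is decoded whole or not at all
        decoded.append(j)
        used += bits_per_subband[j]
    return decoded, used
```

For example, with importance values [3.0, 5.0, 5.0, 1.0] the two subbands of importance 5.0 are visited first, the lower-frequency one ahead of the other.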
[00407] FIG. 8 is a flow chart of an embodiment of a hierarchical audio decoding method according to the present invention. As shown in FIG. 8 the method comprises the following steps.
[00408] In step 801, the encoded bits of a frame are extracted from the hierarchical bit stream transmitted by the encoding terminal (i.e., by the bit stream demultiplexer, DeMUX).
[00409] After the encoded bits are extracted, the side information is decoded first, and then Huffman decoding or direct decoding is performed on the encoded bits of the amplitude envelopes of the core layer in this frame, according to the value of Flag_huff_rms_core, to obtain the amplitude envelope quantization indices Thq(j), j = 0, ..., L_core-1, of the coding subbands of the core layer.
[00410] In step 802, the initial importance values of the coding subbands of the core layer are calculated from the amplitude envelope quantization indices of the coding subbands of the core layer, and bit allocation is performed on the coding subbands of the core layer using the importance of the subbands, to obtain the bit allocation numbers of the core layer; the bit allocation method at the decoding terminal is exactly the same as that at the encoding terminal. In the bit allocation process, both the bit allocation step size and the step size by which the importance of a coding subband is reduced after bit allocation are variable.
[00411] After this bit allocation process is complete, bit allocation is performed a further count_core times on the coding subbands of the core layer, according to the value count_core of the number of bit allocation corrections made for the core layer at the encoding terminal and the importance of the coding subbands of the core layer, and then the whole bit allocation process ends.
[00412] In the bit allocation process, the bit allocation step size for a coding subband whose bit allocation number is 0 is 1 bit, and the importance reduction step size after bit allocation is 1; when bits are additionally allocated to a coding subband whose bit allocation number is greater than 0 and less than a given threshold, the bit allocation step size is 0.5 bit, and the importance reduction step size after bit allocation is also 0.5; and when bits are additionally allocated to a coding subband whose bit allocation number is greater than or equal to that threshold, the bit allocation step size is 1 bit, and the importance reduction step size after bit allocation is also 1.
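A minimal sketch of an importance-driven allocation loop with the variable step sizes described above is given below. It is not the allocation procedure of the invention: the greedy selection, the tie-break by lowest index, the stopping rule, and the parameters threshold, max_bits and widths (coefficients per subband) are assumptions made for illustration.

```python
def allocate_bits(importance, widths, total_bits, threshold=5.0, max_bits=9.0):
    """Greedy allocation of bits per coefficient with variable step sizes.

    importance: initial importance value of each subband
    widths:     number of coefficients in each subband (assumption)
    Returns the bits per coefficient of each subband and the unused bits.
    """
    imp = list(importance)
    bits = [0.0] * len(imp)
    remaining = total_bits
    active = set(range(len(imp)))
    while active:
        # Most important subband first; ties go to the lowest index.
        j = min(active, key=lambda s: (-imp[s], s))
        # Step is 1 bit from zero or at/above the threshold, else 0.5 bit.
        step = 1.0 if bits[j] == 0 or bits[j] >= threshold else 0.5
        cost = step * widths[j]
        if cost > remaining or bits[j] + step > max_bits:
            active.discard(j)  # assumption: drop subbands that cannot grow
            continue
        bits[j] += step
        imp[j] -= step  # the importance drops by the same step size
        remaining -= cost
    return bits, remaining
```

Because every step is a multiple of 0.5, the running totals stay exact in binary floating point for inputs of this kind.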
[00413] In step 803, decoding, inverse quantization and inverse normalization are performed on the encoded bits of the frequency domain coefficients of the core layer, using the bit allocation numbers of the coding subbands of the core layer and the quantized amplitude envelope values of the core layer coding subbands, according to Flag_huff_PLVQ_core, to obtain the frequency domain coefficients of the core layer.
[00414] In step 804, when decoding and inverse quantization are performed on the encoded bits of the frequency domain coefficients of the core layer, the coding subbands of the core layer are divided into low-bit coding subbands and high-bit coding subbands according to the bit allocation numbers of the core layer coding subbands, and inverse quantization is performed on the low-bit coding subbands and the high-bit coding subbands using the pyramid lattice vector quantization/inverse quantization method and the sphere lattice vector quantization/inverse quantization method, respectively.
[00415] Huffman decoding or direct natural decoding is performed on the low-bit coding subbands, according to the side information of the core layer, to obtain the pyramid lattice vector quantization indices of the low-bit coding subbands, and inverse quantization and inverse normalization are performed on all pyramid lattice vector quantization indices to obtain the frequency domain coefficients of those coding subbands. The pyramid lattice vector inverse quantization process is described below:
[00416] a, for all j = 0, ..., L_core-1, if Flag_huff_PLVQ_core = 0, the m-th vector quantization index index_b(j, m) of low-bit coding subband j is obtained by direct decoding; and if Flag_huff_PLVQ_core = 1, the m-th vector quantization index index_b(j, m) of low-bit coding subband j is obtained according to the Huffman code table corresponding to the number of bits allocated to a single frequency domain coefficient of the coding subband.
[00417] When the number of bits allocated to a single frequency domain coefficient of the coding subband is 1: if the natural binary code value of the quantization index is less than “1111 111”, the quantization index is calculated from the natural binary code value; if the natural binary code value equals “1111 111”, the next bit is read, and the quantization index value is 127 if the next bit is 0 and 128 if the next bit is 1.
[00418] b, the pyramid lattice vector inverse quantization of the quantization indices is the inverse of the vector quantization process 108, as follows:
[00419] 1) the energy pyramid surface on which the vector quantization index lies, and the label of the index on that pyramid surface, are determined.
[00420] kk is searched for among the pyramid surface energies from 2 to LargeK(region_bit(j)), such that the following inequality holds:
[00421] N (8, kk) <= index_b (j, m) <N (8, kk + 2),
[00422] If such a kk is found, then K = kk is the energy of the pyramid surface on which the D8 lattice point corresponding to the quantization index index_b(j, m) lies, and b = index_b(j, m) - N(8, kk) is the index label of that D8 lattice point on that pyramid surface;
[00423] If no such kk can be found, then the pyramid surface energy of the D8 lattice point corresponding to the quantization index index_b(j, m) is K = 0, and the index label is b = 0.
[00424] 2) the specific steps for solving the D8 lattice point vector Y = (y1, y2, y3, y4, y5, y6, y7, y8), whose pyramid surface energy is K and whose index label is b, are as follows:
[00425] in step 1, let Y = (0,0,0,0,0,0,0,0), xb = 0, i = 1, k = K, l = 8;
[00426] in step 2, if b = xb, then yi = 0, and jump to step 6;
[00427] in step 3, if b < xb + N(l-1, k), then yi = 0, and jump to step 5;
[00428] otherwise, xb = xb + N(l-1, k), and j = 1 is set;
[00429] in step 4, if b < xb + 2 * N(l-1, k-j), then
[00430] if xb <= b < xb + N(l-1, k-j), then yi = j;
[00431] if b >= xb + N(l-1, k-j), then yi = -j, xb = xb + N(l-1, k-j);
[00432] otherwise, xb = xb + 2 * N(l-1, k-j), j = j + 1, and step 4 is repeated;
[00433] in step 5, update k = k - |yi|, l = l - 1, i = i + 1, and if k > 0, then jump to step 2;
[00434] in step 6, if k > 0, then y8 = k - |yi|, and Y = (y1, y2, ..., y8) is the solved lattice point.
[00435] 3) the energy of the solved D8 lattice point is inversely scaled, to obtain: Ym = (Y + a) / scale(index)
[00436] where a = (2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6), and scale(index) is a scale factor, which can be found in Table 5.
[00437] 4) inverse normalization is performed on Ym, to obtain the frequency domain coefficients of the m-th vector of coding subband j recovered by the decoding terminal: 2^(Thq(j)/2) * Ym
[00438] where Thq(j) is the amplitude envelope quantization index of the j-th coding subband.
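The sub-steps a and b above (reading an index with the escape code, solving the lattice point from its energy K and label b, then rescaling) can be sketched as follows. Several points are assumptions rather than statements of the source: N(l, k) is taken to be the number of l-dimensional integer vectors whose absolute values sum to k, computed with the standard recurrence of pyramid vector quantization; the inverse normalization gain is taken to be 2^(Thq(j)/2); the scale factor is passed in because Table 5 is not reproduced here; and all function names are hypothetical.

```python
from functools import lru_cache


def read_one_bit_index(bits):
    """Read an index for a subband with 1 bit per coefficient.

    `bits` is an iterator of 0/1 values. Seven bits form a natural binary
    number; the all-ones code "1111 111" escapes to one extra bit.
    """
    value = 0
    for _ in range(7):
        value = (value << 1) | next(bits)
    if value < 127:
        return value
    return 127 if next(bits) == 0 else 128


@lru_cache(maxsize=None)
def pyramid_count(l, k):
    """Assumed N(l, k): integer vectors of dimension l with L1 norm k."""
    if k == 0:
        return 1
    if l == 0:
        return 0
    return (pyramid_count(l - 1, k) + pyramid_count(l - 1, k - 1)
            + pyramid_count(l, k - 1))


def decode_pyramid_point(b, K, L=8):
    """Steps 1 to 6: recover the vector with energy K and label b."""
    if not 0 <= b < pyramid_count(L, K):
        raise ValueError("label out of range for this pyramid surface")
    y = [0] * L
    xb, k, l, i = 0, K, L, 0               # step 1 (i is zero-based here)
    while k > 0:
        if b == xb:                        # step 2, then step 6
            y[L - 1] = k
            break
        if b >= xb + pyramid_count(l - 1, k):   # step 3: y[i] is non-zero
            xb += pyramid_count(l - 1, k)
            j = 1
            while True:                    # step 4
                n = pyramid_count(l - 1, k - j)
                if b < xb + 2 * n:
                    if b >= xb + n:
                        xb += n
                        y[i] = -j
                    else:
                        y[i] = j
                    break
                xb += 2 * n
                j += 1
        k -= abs(y[i])                     # step 5
        l -= 1
        i += 1
    return y


def dequantize(y, scale, thq):
    """Sub-steps 3) and 4): Ym = (Y + a) / scale, then apply the gain."""
    a = 2.0 ** -6
    gain = 2.0 ** (thq / 2.0)              # assumed inverse normalization
    return [gain * (v + a) / scale for v in y]
```

Under the assumed meaning of N(l, k), every label from 0 to N(8, K) - 1 maps to a distinct vector, which is the property an enumeration of this kind needs in order to be invertible.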
[00439] Natural decoding is performed directly on the encoded bits of the high-bit coding subbands to obtain the m-th index vector k of high-bit coding subband j; the sphere lattice vector inverse quantization of this index vector is the inverse of the quantization process, and the specific steps are as follows:
[00440] a, x = k * G is calculated, and ytemp = x / (2^(region_bit(j))) is calculated; where k is the quantization index vector, region_bit(j) represents the number of bits allocated to a single frequency domain coefficient in coding subband j, and G is the generator matrix of the D8 lattice, whose form is as follows:

[00441] b, y = x - f_D8(ytemp) * (2^(region_bit(j))) is calculated;
[00442] c, the energy of the solved D8 lattice point is inversely scaled, to obtain: Ym = y * scale(region_bit(j)) / (2^(region_bit(j))) + a,
[00443] where a = (2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6, 2^-6), and scale(region_bit(j)) is a scale factor, which can be found in Table 10.
[00444] d, inverse normalization is performed on Ym, to obtain the frequency domain coefficients of the m-th vector of coding subband j recovered by the decoding terminal: 2^(Thq(j)/2) * Ym
[00445] where Thq(j) is the amplitude envelope quantization index of the j-th coding subband.
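Steps a to c above can be sketched as follows. The generator matrix is not reproduced in this text, so the matrix G below is an assumption: it is one standard basis of the D8 lattice (integer vectors with an even coordinate sum) and may differ from the matrix of the invention. The lattice function is read here as a nearest-D8-point search implemented with the usual round-then-fix-parity rule; the function names and the scale argument (a stand-in for Table 10) are hypothetical.

```python
import math

# Assumed generator matrix: a standard basis of the D8 lattice. The first
# row is (2, 0, ..., 0); row r has a 1 in column 0 and a 1 in column r.
G = [[2, 0, 0, 0, 0, 0, 0, 0]] + [
    [1] + [1 if c == r else 0 for c in range(1, 8)] for r in range(1, 8)
]


def nearest_d8(v):
    """Closest D8 point to v: round each coordinate, then repair parity."""
    r = [math.floor(t + 0.5) for t in v]
    if sum(r) % 2 != 0:
        # Re-round the coordinate with the largest rounding error the other way.
        i = max(range(len(v)), key=lambda n: abs(v[n] - r[n]))
        r[i] += 1 if v[i] > r[i] else -1
    return r


def sphere_dequantize(k, region_bit, scale):
    """Return the lattice point y and the rescaled vector Ym.

    k:          index vector of 8 integers
    region_bit: bits allocated to a single coefficient of the subband
    scale:      scale factor (hypothetical stand-in for a Table 10 entry)
    """
    x = [sum(k[r] * G[r][c] for r in range(8)) for c in range(8)]  # x = k * G
    m = 2 ** region_bit
    f = nearest_d8([t / m for t in x])          # lattice function of ytemp
    y = [xc - fc * m for xc, fc in zip(x, f)]   # y = x - f * 2^region_bit
    a = 2.0 ** -6
    ym = [yc * scale / m + a for yc in y]
    return y, ym
```

The inverse normalization of step d would then multiply each entry of Ym by the subband gain, exactly as for the low-bit subbands.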
[00446] In step 805, the amplitude envelope quantization indices of the residual signal subbands of the core layer are calculated using the amplitude envelope quantization indices of the coding subbands of the core layer and the bit allocation numbers of the core layer coding subbands; the calculation method at the decoding terminal is exactly the same as at the encoding terminal.
[00447] Huffman decoding or direct decoding is performed on the encoded bits of the amplitude envelopes of the coding subbands of the extended layer, according to the value of Flag_huff_rms_ext, to obtain the amplitude envelope quantization indices Thq(j), j = L_core, ..., L-1, of the coding subbands of the extended layer.
[00448] In step 806, the coding signals of the extended layer consist of the residual signals of the core layer and the frequency domain coefficients of the extended layer; the initial importance values of the coding subbands of the extended layer coding signals are calculated from the amplitude envelope quantization indices of those coding subbands, and bit allocation is performed on the coding subbands of the extended layer coding signals using these initial importance values, to obtain the bit allocation numbers of the coding subbands of the extended layer coding signals.
[00449] The method of calculating the initial importance values of the coding subbands and the bit allocation method at the decoding terminal are the same as those at the encoding terminal.
[00450] In step 807, the coding signals of the extended layer are calculated.
[00451] Decoding and inverse quantization are performed on the encoded bits of the extended layer coding signals using the bit allocation numbers of the extended layer coding signals, and inverse normalization is performed on the inversely quantized data using the quantized amplitude envelope values of the coding subbands of the extended layer coding signals, to obtain the coding signals of the extended layer.
[00452] The decoding and inverse quantization methods of the extended layer are the same as those of the core layer.
[00453] In this step, the decoding order of the coding subbands of the extended layer coding signals is determined by the initial importance values of those coding subbands. If two coding subbands of the extended layer coding signals have equal importance, the lower-frequency coding subband is decoded first; meanwhile, the number of decoded bits is counted, and decoding stops when the number of decoded bits reaches the required total number of bits.
[00454] For example, the transmission bit rate from the encoding terminal to the decoding terminal is 64 kbps; however, for network reasons the decoding terminal may obtain only the first 48 kbps of information in the bit stream, or the decoding terminal may support only 48 kbps decoding, and decoding therefore stops once the decoding terminal has decoded the 48 kbps of information.
[00455] In step 808, the coding signals obtained by decoding in the extended layer are rearranged in subband order, and the frequency domain coefficients of the core layer are added to the extended layer coding signals at the same frequencies, to obtain the output values of the frequency domain coefficients.
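A minimal sketch of step 808 follows, assuming that the extended layer coding signals arrive as a mapping from subband number to decoded values (in whatever order they were decoded) and that subbands never decoded contribute zeros. The function name and the band_edges layout are hypothetical.

```python
def rebuild_full_band(core_coeffs, ext_decoded, band_edges):
    """Combine core coefficients with extended layer coding signals.

    core_coeffs: decoded core layer frequency domain coefficients
    ext_decoded: {subband number: decoded values} for the extended layer
                 coding signals (core residuals and extended coefficients)
    band_edges:  band_edges[j] is the first coefficient index of subband j
                 and band_edges[-1] is the total number of coefficients
    """
    n = band_edges[-1]
    total = list(core_coeffs) + [0.0] * (n - len(core_coeffs))
    for j in sorted(ext_decoded):          # rearrange in subband order
        start = band_edges[j]
        for offset, value in enumerate(ext_decoded[j]):
            total[start + offset] += value  # add at the same frequency
    return total
```

Subbands inside the core range receive the decoded residual on top of the core coefficient, while subbands above the core range simply take the extended layer value.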
[00456] In step 809, noise filling is performed on the subbands to which no encoded bits were allocated in the encoding process and on the subbands lost in transmission.
[00457] In step 810, when the transient detection flag bit Flag_transient is 1, the frequency domain coefficients are rearranged; that is, the frequency domain coefficients corresponding to the L subbands in Table 2 are moved back to the locations of their original frequency domain coefficient indices, and the frequency domain coefficients whose indices are not referenced in Table 2 are set to 0.
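The rearrangement of step 810 amounts to scattering the decoded values back to their original indices. The sketch below assumes that Table 2 can be represented as a list holding, for each subband, the original coefficient indices of that subband; the index_table contents used in any call are hypothetical, since Table 2 is not reproduced in this section.

```python
def restore_original_positions(coeffs_by_subband, index_table, n):
    """Scatter subband-ordered coefficients back to their original indices.

    coeffs_by_subband: decoded coefficients, one list per subband
    index_table:       original coefficient indices of each subband
                       (hypothetical stand-in for Table 2)
    n:                 number of frequency domain coefficients in the frame
    Indices that the table never references stay at 0.
    """
    out = [0.0] * n
    for coeffs, indices in zip(coeffs_by_subband, index_table):
        for value, idx in zip(coeffs, indices):
            out[idx] = value
    return out
```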
[00458] In step 811, the inverse time frequency transformation is performed on the frequency domain coefficients to obtain the final audio output signal. The specific steps are as follows.
[00459] When the transient detection flag bit Flag_transient is 0, an inverse DCT-IV transform of length N is performed on the N-point frequency domain coefficients, to obtain an N-point time domain signal, n = 0, ..., N-1.
[00460] When the transient detection flag bit Flag_transient is 1, the N-point frequency domain coefficients are first divided into 4 groups of equal length; inverse time-domain aliasing processing and an inverse DCT-IV transform of length N/4 are performed on each group of frequency domain coefficients; windowing (with the same window structure as at the encoding terminal) is then applied to the 4 groups of signals obtained; and the 4 groups of windowed signals are overlapped and added to obtain x*(n), n = 0, ..., N-1.
[00461] Inverse time-domain aliasing processing and windowing (with the same window structure as at the encoding terminal) are performed on the N-point time domain signal, n = 0, ..., N-1. Two adjacent frames are then overlapped and added to obtain the final audio output signal.
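The transform part of step 811 can be sketched as follows. Only the inverse DCT-IV and the split into 4 groups are shown; the inverse time-domain aliasing, the windowing and the overlap-add are left out because the window is not specified in this section. The unnormalized DCT-IV definition and its 2/N inverse scaling are assumptions about the transform convention, and the function names are hypothetical.

```python
import math


def dct_iv(x):
    """Unnormalized DCT-IV of a sequence (direct O(n^2) evaluation)."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5) * (k + 0.5))
            for t in range(n))
        for k in range(n)
    ]


def inverse_dct_iv(coeffs):
    """Inverse of dct_iv: the same transform scaled by 2 / n."""
    n = len(coeffs)
    return [2.0 / n * v for v in dct_iv(coeffs)]


def inverse_transform(coeffs, flag_transient, groups=4):
    """Return the time domain group(s) before aliasing removal and windowing.

    Steady-state frames use one transform of length N; transient frames
    use `groups` transforms of length N / groups.
    """
    if not flag_transient:
        return [inverse_dct_iv(coeffs)]
    size = len(coeffs) // groups
    return [inverse_dct_iv(coeffs[g * size:(g + 1) * size])
            for g in range(groups)]
```

A practical decoder would use a fast algorithm instead of the direct sum; the quadratic form is kept here only because it makes the definition easy to check.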
[00462] FIG. 9 is a structural diagram of a hierarchical audio decoding system in accordance with the present invention. As shown in FIG. 9, the system comprises: a bit stream demultiplexer (DeMUX), an amplitude envelope decoding unit for the coding subbands, a core layer bit allocation unit, a core layer decoding and inverse quantization unit, a residual signal amplitude envelope generation unit, an extended layer bit allocation unit, an extended layer coding signal decoding and inverse quantization unit, a total bandwidth frequency domain coefficient recovery unit, a noise filling unit and an audio signal recovery unit; wherein
[00463] the amplitude envelope decoding unit is connected to the bit stream demultiplexer, and is configured to: decode the encoded bits of the amplitude envelopes of the coding subbands of the core layer and of the extended layer, output by the bit stream demultiplexer, to obtain the amplitude envelope quantization indices of the coding subbands of the core layer and of the extended layer; and, if the transient detection information indicates a transient signal, additionally rearrange the amplitude envelope quantization indices of the coding subbands of the core layer and of the extended layer so that their corresponding frequencies are ordered from low to high within the respective layers;
[00464] the core layer bit allocation unit is connected to the amplitude envelope decoding unit, and is configured to perform bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, to obtain the bit allocation numbers of the core layer coding subbands;
[00465] the core layer decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to: calculate the quantized amplitude envelope values of the core layer coding subbands from the amplitude envelope quantization indices of the core layer coding subbands; and perform decoding, inverse quantization and inverse normalization on the encoded bits of the frequency domain coefficients of the core layer, sent by the bit stream demultiplexer, using the bit allocation numbers and the quantized amplitude envelope values of the core layer coding subbands, to obtain the frequency domain coefficients of the core layer;
[00466] the residual signal amplitude envelope generation unit is connected to the amplitude envelope decoding unit and the core layer bit allocation unit, and is configured to: look up a statistical table of correction values for the amplitude envelope quantization indices of the residual signals of the core layer, according to the amplitude envelope quantization indices of the core layer coding subbands and the bit allocation numbers of the corresponding coding subbands, to obtain the amplitude envelope quantization indices of the residual signals of the core layer;
[00467] the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope decoding unit, and is configured to: perform bit allocation on the coding subbands of the extended layer coding signals, according to the amplitude envelope quantization indices of the residual signals of the core layer and the amplitude envelope quantization indices of the coding subbands of the extended layer, to obtain the bit allocation numbers of the coding subbands of the extended layer coding signals;
[00468] the extended layer coding signal decoding and inverse quantization unit is connected to the bit stream demultiplexer, the amplitude envelope decoding unit, the extended layer bit allocation unit and the residual signal amplitude envelope generation unit, and is configured to: calculate the quantized amplitude envelope values of the coding subbands of the extended layer coding signals from the amplitude envelope quantization indices of those coding subbands; and perform decoding, inverse quantization and inverse normalization on the encoded bits of the extended layer coding signals, sent by the bit stream demultiplexer, using the bit allocation numbers and the quantized amplitude envelope values of the coding subbands of the extended layer coding signals, to obtain the coding signals of the extended layer;
[00469] the total bandwidth frequency domain coefficient recovery unit is connected to the core layer decoding and inverse quantization unit and the extended layer coding signal decoding and inverse quantization unit, and is configured to: rearrange the extended layer coding signals, sent by the extended layer coding signal decoding and inverse quantization unit, in subband order, and then add them to the frequency domain coefficients of the core layer, sent by the core layer decoding and inverse quantization unit, to obtain the frequency domain coefficients of the total bandwidth;
[00470] the noise filling unit is connected to the total bandwidth frequency domain coefficient recovery unit and the amplitude envelope decoding unit, and is configured to perform noise filling on the subbands to which no encoded bits were allocated in the encoding process;
[00471] the audio signal recovery unit is connected to the noise filling unit, and is configured to: if the transient detection information indicates a steady-state signal, perform an inverse time-frequency transform directly on the frequency domain coefficients of the total bandwidth, to obtain an audio signal for output; and if the transient detection information indicates a transient signal, rearrange the frequency domain coefficients of the total bandwidth, divide them into M groups of frequency domain coefficients, perform the inverse time-frequency transform on each group, and calculate the final audio signal from the M groups of time domain signals obtained by the transform.
[00472] The residual signal amplitude envelope generation unit further comprises a module for acquiring correction values of quantization indices and a module for calculating the amplitude envelope quantization indices of the residual signals;
[00473] the module for acquiring correction values of quantization indices is configured to look up a statistical table of correction values for the amplitude envelope quantization indices of the residual signals of the core layer, according to the bit allocation numbers of the coding subbands of the core layer, to obtain the correction values of the quantization indices of the coding subbands of the residual signals, wherein the correction value of the quantization index of each coding subband is greater than or equal to 0 and does not decrease when the bit allocation number of the corresponding core layer coding subband increases; if the bit allocation number of a given core layer coding subband is 0, the correction value of the quantization index of the residual signal of the core layer in that coding subband is 0; and if the bit allocation number of a given core layer coding subband is the defined maximum bit allocation number, the amplitude envelope value of the residual signal in that coding subband is 0; and
[00474] the module for calculating the amplitude envelope quantization indices of the residual signals is configured to perform a differential calculation between the amplitude envelope quantization index of each core layer coding subband and the correction value of the quantization index of the corresponding coding subband, to obtain the amplitude envelope quantization index of that coding subband of the residual signal of the core layer.
[00475] The extended layer coding signal decoding and inverse quantization unit is further configured to: determine the decoding order of the coding subbands of the extended layer coding signals according to the initial importance values of those coding subbands, decoding the coding subbands with the greatest importance first; if two coding subbands of the extended layer coding signals have equal importance, decode the lower-frequency coding subband first; count the number of bits decoded during decoding; and stop decoding when the number of decoded bits reaches the required total number of bits.
[00476] That is, the order in which the extended layer coding signal decoding and inverse quantization unit decodes the coding subbands of the extended layer coding signals is determined by the initial importance values of those coding subbands: the coding subbands with the greatest importance are decoded first; if two coding subbands have equal importance, the lower-frequency coding subband is decoded first; the number of bits decoded is counted during decoding; and decoding stops when the number of decoded bits reaches the required total number of bits.
[00477] The rearrangement of the frequency domain coefficients of the total bandwidth by the recovery unit specifically comprises: rearranging the frequency domain coefficients so that their corresponding coding subbands are ordered from low to high frequency within the respective subframes, to obtain M groups of frequency domain coefficients, and then arranging the M groups of frequency domain coefficients in subframe order.
[00478] If the transient detection information indicates a transient signal, the calculation of the final audio signal by the audio signal recovery unit from the M groups of time domain signals obtained by the transform specifically comprises: performing inverse time-domain aliasing processing on each group of time domain signals, then windowing the M groups of signals obtained, and then overlapping and adding the M groups of windowed signals to obtain an N-point time domain sampling signal x*(n); and performing inverse time-domain aliasing processing and windowing on that time domain signal, and overlapping and adding two adjacent frames to obtain the final audio output signal.
[00479] The present invention further provides hierarchical encoding and decoding methods for transient signals as follows.
[00480] The hierarchical audio encoding method for the transient signals according to the present invention comprises:
[00481] A1, dividing an audio signal into M subframes and performing a time-frequency transform on each subframe, the M groups of frequency domain coefficients obtained by the transform constituting the total frequency domain coefficients of the current frame; and rearranging the total frequency domain coefficients so that their corresponding coding subbands are ordered from low to high frequency, wherein the total frequency domain coefficients comprise core layer frequency domain coefficients and extended layer frequency domain coefficients, the coding subbands comprise core layer coding subbands and extended layer coding subbands, the core layer frequency domain coefficients constitute several core layer coding subbands, and the extended layer frequency domain coefficients constitute several extended layer coding subbands;
[00482] B1, quantizing and encoding the amplitude envelope values of the core layer coding subbands and the extended layer coding subbands, to obtain the amplitude envelope quantization indices and encoded bits of the core layer coding subbands and the extended layer coding subbands; wherein the amplitude envelope values of the core layer coding subbands and of the extended layer coding subbands are quantized separately, and the amplitude envelope quantization indices of the core layer coding subbands and those of the extended layer coding subbands are rearranged separately;
[00483] C1, performing bit allocation on the core layer coding subbands according to the amplitude envelope quantization indices of the core layer coding subbands, and then quantizing and encoding the frequency domain coefficients of the core layer to obtain the encoded bits of the frequency domain coefficients of the core layer;
[00484] D1, inversely quantizing the frequency domain coefficients of the core layer on which vector quantization has been performed, and computing the difference between the result and the original frequency domain coefficients obtained from the time-frequency transform, to obtain residual signals of the core layer;
[00485] E1, calculating amplitude envelope quantization indices of the coding sub-bands of the residual signals of the core layer according to the amplitude envelope quantization indices and the bit allocation numbers of the coding sub-bands of the core layer;
[00486] F1, performing bit allocation on the coding sub-bands of the coding signals of the extended layer according to the amplitude envelope quantization indices of the residual signals of the core layer and the amplitude envelope quantization indices of the coding sub-bands of the extended layer, and then quantizing and encoding the coding signals of the extended layer to obtain encoded bits of the coding signals of the extended layer, wherein the coding signals of the extended layer consist of the residual signals of the core layer and the frequency domain coefficients of the extended layer; and
[00487] G1, multiplexing and packaging the encoded amplitude envelope bits of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, the encoded bits of the frequency domain coefficients of the core layer and the encoded bits of the coding signals of the extended layer, and then transmitting them to a decoding terminal.
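Steps C1, D1 and F1 can be illustrated with a minimal sketch of how the coding signal of the extended layer is formed. The uniform scalar quantizer below is only a stand-in for the vector quantization named in the text, and the function names, the `core_bins` split and the step size are assumptions made for illustration.

```python
def quantize(values, step):
    """Stand-in scalar quantizer (the text uses vector quantization)."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of the stand-in quantizer."""
    return [i * step for i in indices]


def extended_layer_coding_signal(coeffs, core_bins, step):
    """Return (core quantization indices, coding signal of the extended layer).

    coeffs    -- total frequency domain coefficients, low to high frequency
    core_bins -- hypothetical number of leading coefficients in the core layer
    """
    core = coeffs[:core_bins]
    extended = coeffs[core_bins:]
    core_idx = quantize(core, step)                      # step C1
    core_rec = dequantize(core_idx, step)                # step D1: inverse quantization
    residual = [o - r for o, r in zip(core, core_rec)]   # step D1: difference
    # Step F1 input: core layer residual followed by extended layer coefficients.
    return core_idx, residual + extended


core_idx, coding_signal = extended_layer_coding_signal(
    [1.0, 2.5, -0.625, 0.125, 4.0, -3.0], core_bins=4, step=0.5)
print(core_idx)
print(coding_signal)
```

The point of the sketch is that whatever the core layer quantizer fails to capture is carried forward and coded again in the extended layer, next to the coefficients that the core layer does not cover.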
[00488] In step A1, the method for obtaining the total frequency domain coefficients of the current frame comprises:
[00489] composing a 2N-point time domain sampling signal from the N-point time domain sampling signal x(n) of the current frame and the N-point time domain sampling signal xold(n) of the last frame, and then performing windowing and time-domain anti-aliasing processing on the 2N-point signal to obtain an N-point time domain sampling signal; and
[00490] performing reversal processing on the N-point time domain signal so obtained, then adding a sequence of zeros at each end of the signal, dividing the lengthened signal into M subframes that overlap one another, and then performing windowing, time-domain anti-aliasing processing and the time-frequency transform on the time domain signal of each subframe, to obtain M groups of frequency domain coefficients, which constitute the total frequency domain coefficients of the current frame.
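The subframe division described above can be sketched as follows. The text does not state the padding length or the amount of overlap, so this sketch assumes a hop of N/M samples, subframes of 2N/M samples (50% overlap) and N/(2M) zeros at each end; the function name is hypothetical, and windowing and the transform itself are omitted.

```python
def split_into_subframes(signal, m):
    """Reverse the signal, zero-pad both ends, and cut M overlapping subframes.

    Assumed layout (not given in the text): hop = N/M, subframe length = 2N/M,
    N/(2M) zeros added at each end.
    """
    n = len(signal)
    if n % (2 * m) != 0:
        raise ValueError("frame length must be a multiple of 2*m")
    hop = n // m
    pad = hop // 2
    reversed_signal = signal[::-1]                        # reversal processing
    lengthened = [0.0] * pad + reversed_signal + [0.0] * pad
    return [lengthened[i * hop:i * hop + 2 * hop] for i in range(m)]


frame = [float(v) for v in range(1, 17)]   # N = 16 samples
subframes = split_into_subframes(frame, m=4)
for sub in subframes:
    print(sub)
```

With these assumptions the lengthened signal has N + N/M samples, which is exactly what M subframes of length 2N/M at a hop of N/M cover.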
[00491] In step A1, when the frequency domain coefficients are reorganized, they are reorganized so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies within the core layer and within the extended layer, respectively.
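One possible reading of this reorganization is sketched below: sub-bands covering the same frequency range in each of the M subframes are placed next to each other, separately for the core layer and for the extended layer. The exact index mapping is not reproduced in this text, so the equal sub-band width, the `core_bands` split and the function names are assumptions.

```python
def reorganize_layer(groups, band_width, first_band, last_band):
    """Interleave sub-bands [first_band, last_band) of all subframes by frequency."""
    out = []
    for band in range(first_band, last_band):
        start = band * band_width
        for group in groups:                  # same band, subframe after subframe
            out.extend(group[start:start + band_width])
    return out


def reorganize(groups, band_width, core_bands):
    """Return (core layer coefficients, extended layer coefficients).

    groups     -- M lists of coefficients, each ordered low to high frequency
    core_bands -- hypothetical number of sub-bands per subframe in the core layer
    """
    total_bands = len(groups[0]) // band_width
    core = reorganize_layer(groups, band_width, 0, core_bands)
    extended = reorganize_layer(groups, band_width, core_bands, total_bands)
    return core, extended


groups = [[0, 1, 2, 3, 4, 5, 6, 7], [10, 11, 12, 13, 14, 15, 16, 17]]
core, extended = reorganize(groups, band_width=2, core_bands=2)
print(core)
print(extended)
```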
[00492] In step B1, the reorganization of the amplitude envelope quantization indices specifically comprises:
[00493] placing together the amplitude envelope quantization indices of the coding sub-bands within the same subframe, so that their corresponding frequencies are aligned in ascending or descending order, and, at each subframe boundary, joining the two subframes through the two coding sub-bands that represent the same frequencies and belong to the two subframes respectively.
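The ordering described above can be read as a serpentine scan: one subframe is traversed in ascending frequency, the next in descending frequency, so that the two entries on either side of a subframe boundary belong to the same frequency. The sketch below illustrates that interpretation only; the function name and the choice to start in ascending order are assumptions.

```python
def serpentine_order(envelope_indices):
    """Flatten envelope_indices[m][b] (subframe m, sub-band b) in serpentine order.

    Even subframes are read low to high, odd subframes high to low, so the
    entries adjacent to each subframe boundary share the same sub-band.
    """
    ordered = []
    for m, row in enumerate(envelope_indices):
        ordered.extend(row if m % 2 == 0 else row[::-1])
    return ordered


indices = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
print(serpentine_order(indices))
```

Neighbouring entries in such a sequence tend to have similar values, which is what a subsequent differential or entropy coder would benefit from.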
[00494] In step G1, multiplexing and packaging are performed according to the following bit stream format:
[00495] first, the side information bits of the core layer are written after the frame header of the bit stream, then the encoded amplitude envelope bits of the coding sub-bands of the core layer are written into a bit stream multiplexer (MUX), and then the encoded bits of the frequency domain coefficients of the core layer are written into the MUX;
[00496] then, the side information bits of the extended layer are written into the MUX, then the encoded amplitude envelope bits of the coding sub-bands of the frequency domain coefficients of the extended layer are written into the MUX, and then the encoded bits of the coding signals of the extended layer are written into the MUX; and
[00497] the number of bits that meets the bit rate requirement is transmitted to the decoding terminal according to the required bit rate.
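The write order and the rate-dependent truncation can be sketched as below. Bits are represented as strings of "0" and "1" for readability; the frame header, the field widths and the function name are not specified here and are assumptions.

```python
def pack_frame(core_side, core_env, core_coef, ext_side, ext_env, ext_sig, bit_budget):
    """Concatenate the six fields in the stated order and keep bit_budget bits.

    Each argument is a string of '0'/'1' characters (illustrative representation).
    """
    stream = core_side + core_env + core_coef + ext_side + ext_env + ext_sig
    # Only as many bits as the required bit rate allows are transmitted.
    return stream[:bit_budget]


full = pack_frame("1", "00", "111", "0", "01", "1010", bit_budget=20)
truncated = pack_frame("1", "00", "111", "0", "01", "1010", bit_budget=8)
print(full)
print(truncated)
```

Because the core layer fields come first, truncating the stream removes extended layer bits before it touches any core layer bits, which is what makes the stream usable at several bit rates.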
[00498] The side information of the core layer comprises a transient detection indicator bit, a Huffman coding indicator bit for the amplitude envelopes of the coding sub-bands of the core layer, a Huffman coding indicator bit for the frequency domain coefficients of the core layer, and bits indicating the number of iterations of the bit allocation correction of the core layer.
[00499] The side information of the extended layer comprises a Huffman coding indicator bit for the amplitude envelopes of the coding sub-bands of the extended layer, a Huffman coding indicator bit for the coding signals of the extended layer, and bits indicating the number of iterations of the bit allocation correction of the extended layer.
[00500] The hierarchical decoding method for transient signals according to the present invention comprises:
[00501] A2, demultiplexing a bit stream transmitted by an encoding terminal, decoding the encoded amplitude envelope bits of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer to obtain amplitude envelope quantization indices of the coding sub-bands of the core layer and of the extended layer, and reorganizing the amplitude envelope quantization indices of the coding sub-bands of the core layer and of the extended layer, respectively, so that their corresponding frequencies are aligned from low to high within the respective layers;
[00502] B2, performing bit allocation on the coding sub-bands of the core layer according to the reorganized amplitude envelope quantization indices of the coding sub-bands of the core layer, and thereby calculating the amplitude envelope quantization indices of the residual signals of the core layer;
[00503] C2, performing bit allocation on the coding sub-bands of the coding signals of the extended layer according to the amplitude envelope quantization indices of the residual signals of the core layer and the reorganized amplitude envelope quantization indices of the coding sub-bands of the extended layer;
[00504] D2, decoding the encoded bits of the frequency domain coefficients of the core layer and the encoded bits of the coding signals of the extended layer, respectively, according to the bit allocation numbers of the core layer and of the extended layer, to obtain the frequency domain coefficients of the core layer and the coding signals of the extended layer, reorganizing the coding signals of the extended layer in sub-band order and adding them to the frequency domain coefficients of the core layer, to obtain frequency domain coefficients of the total bandwidth; and
[00505] E2, reorganizing the frequency domain coefficients of the total bandwidth and dividing them into M groups, performing an inverse time-frequency transform on each group of frequency domain coefficients, and calculating a final audio signal from the M groups of time domain signals obtained by the transform.
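The addition in step D2 can be sketched as follows, under the assumption that the decoded coding signal of the extended layer has already been put in sub-band order with the core layer residual first and the extended layer coefficients after it. The function name is hypothetical.

```python
def reconstruct_full_band(core_coeffs, ext_coding_signal):
    """Combine decoded core coefficients with the decoded extended layer signal.

    Assumes ext_coding_signal = core layer residual (same length as core_coeffs)
    followed by the frequency domain coefficients of the extended layer.
    """
    core_bins = len(core_coeffs)
    residual = ext_coding_signal[:core_bins]
    extended = ext_coding_signal[core_bins:]
    refined_core = [c + r for c, r in zip(core_coeffs, residual)]
    return refined_core + extended


full_band = reconstruct_full_band(
    [1.0, 2.5, -0.5, 0.0],
    [0.0, 0.0, -0.125, 0.125, 4.0, -3.0])
print(full_band)
```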
[00506] In step E2, the reorganization of the frequency domain coefficients of the total bandwidth specifically comprises reorganizing the frequency domain coefficients so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies within the respective subframes, to obtain M groups of frequency domain coefficients, and then arranging the M groups of frequency domain coefficients in subframe order.
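A minimal sketch of this decoder-side regrouping is given below. It assumes the encoder placed sub-bands of equal width from the M subframes next to each other in frequency order; that layout, the `band_width` parameter and the function name are assumptions, since the index mapping is not reproduced in this text.

```python
def regroup(coeffs, m, band_width):
    """Undo a frequency-ordered interleave and return M per-subframe groups.

    coeffs is assumed to hold, for each sub-band in ascending frequency,
    band_width coefficients from subframe 0, then subframe 1, and so on.
    """
    bands = len(coeffs) // (m * band_width)
    groups = [[] for _ in range(m)]
    pos = 0
    for _ in range(bands):
        for group in groups:
            group.extend(coeffs[pos:pos + band_width])
            pos += band_width
    return groups          # already in subframe order


groups = regroup([0, 1, 10, 11, 2, 3, 12, 13], m=2, band_width=2)
print(groups)
```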
[00507] In step E2, the process of calculating the final audio signal from the M groups of time domain signals obtained by the transform comprises: performing inverse time-domain anti-aliasing processing on each group, then windowing the M groups of signals so obtained, and then overlapping and adding the M groups of windowed signals to obtain an N-point time domain sampling signal; and performing inverse time-domain anti-aliasing processing and windowing on this N-point time domain signal, and overlapping and adding two adjacent frames to obtain the final audio output signal.
INDUSTRIAL APPLICABILITY
[00508] In the present invention, a processing method for transient signal frames is introduced into the hierarchical audio encoding and decoding methods: a segmented time-frequency transform is performed on the transient signal frames, and the frequency domain coefficients obtained by the transform are then reorganized within the core layer and within the extended layer respectively, so that the same subsequent coding processes as those applied to steady-state signal frames, such as bit allocation and coding of frequency domain coefficients, can be performed. This improves the coding efficiency of transient signal frames and the quality of hierarchical audio encoding and decoding.

Claims:
Claims (18)
[0001]
1. HIERARCHICAL AUDIO CODING METHOD, characterized by comprising:
performing transient detection on an audio signal of a current frame;
when the transient detection determines a steady-state signal, performing a time-frequency transform on the audio signal to obtain total frequency domain coefficients; when the transient detection determines a transient signal, dividing the audio signal into M subframes, performing the time-frequency transform on each subframe, the M groups of frequency domain coefficients obtained by the transform constituting the total frequency domain coefficients of the current frame, and reorganizing the total frequency domain coefficients so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies, wherein the total frequency domain coefficients comprise frequency domain coefficients of the core layer and frequency domain coefficients of the extended layer, the coding sub-bands comprise coding sub-bands of the core layer and coding sub-bands of the extended layer, the frequency domain coefficients of the core layer constitute several coding sub-bands of the core layer, and the frequency domain coefficients of the extended layer constitute several coding sub-bands of the extended layer;
quantizing and encoding amplitude envelope values of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, to obtain amplitude envelope quantization indices and encoded amplitude envelope bits of the coding sub-bands of the core layer and of the extended layer; wherein, if the signal is the steady-state signal, the amplitude envelope values of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer are quantized jointly, and if the signal is the transient signal, the amplitude envelope values of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer are quantized separately, and the amplitude envelope quantization indices of the coding sub-bands of the core layer and those of the coding sub-bands of the extended layer are each reorganized;
performing bit allocation on the coding sub-bands of the core layer according to the amplitude envelope quantization indices of the coding sub-bands of the core layer, and then quantizing and encoding the frequency domain coefficients of the core layer to obtain encoded bits of the frequency domain coefficients of the core layer;
inversely quantizing the frequency domain coefficients of the core layer on which vector quantization has been performed, and computing the difference between the result and the original frequency domain coefficients obtained from the time-frequency transform, to obtain residual signals of the core layer;
calculating amplitude envelope quantization indices of the residual signals of the core layer according to the bit allocation numbers and the amplitude envelope quantization indices of the coding sub-bands of the core layer;
performing bit allocation on the coding sub-bands of the coding signals of the extended layer according to the amplitude envelope quantization indices of the residual signals of the core layer and the amplitude envelope quantization indices of the coding sub-bands of the extended layer, and then quantizing and encoding the coding signals of the extended layer to obtain encoded bits of the coding signals of the extended layer, wherein the coding signals of the extended layer consist of the residual signals of the core layer and the frequency domain coefficients of the extended layer; and
multiplexing and packaging the encoded amplitude envelope bits of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, the encoded bits of the frequency domain coefficients of the core layer and the encoded bits of the coding signals of the extended layer, and then transmitting them to a decoding terminal.
[0002]
2. METHOD, according to claim 1, characterized in that, when the transient detection determines the transient signal and the frequency domain coefficients are reorganized, the frequency domain coefficients are reorganized so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies within the core layer and within the extended layer respectively.
[0003]
3. METHOD, according to claim 2, characterized in that, when the reorganization is carried out within the core layer and within the extended layer respectively, if the frequency domain coefficients remaining in a group are not sufficient to constitute a sub-band, they are supplemented with frequency domain coefficients having the same or similar frequencies from the next group of frequency domain coefficients.
[0004]
4. METHOD, according to claim 1 or 2, characterized in that the indices of the frequency domain coefficients in the coding sub-bands are as follows after the reorganization:
[0005]
5. METHOD, according to claim 1, characterized by further comprising: when the transient detection determines a steady-state signal,
performing Huffman coding on the amplitude envelope quantization indices of the coding sub-bands of the core layer obtained by quantization; and if the total number of bits consumed after Huffman coding is performed on the amplitude envelope quantization indices of all coding sub-bands of the core layer is less than the total number of bits consumed after natural coding is performed on the amplitude envelope quantization indices of all coding sub-bands of the core layer, using Huffman coding, otherwise using natural coding, and setting the Huffman coding indicator for the amplitude envelopes of the coding sub-bands of the core layer; and
performing Huffman coding on the amplitude envelope quantization indices of the coding sub-bands of the extended layer obtained by quantization; and if the total number of bits consumed after Huffman coding is performed on the amplitude envelope quantization indices of all coding sub-bands of the extended layer is less than the total number of bits consumed after natural coding is performed on the amplitude envelope quantization indices of all coding sub-bands of the extended layer, using Huffman coding, otherwise using natural coding, and setting the Huffman coding indicator for the amplitude envelopes of the coding sub-bands of the extended layer.
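The selection rule in claim 5 amounts to comparing two bit counts and recording the choice in an indicator bit. The sketch below assumes a hypothetical table of Huffman code lengths per index value and a fixed width for natural (fixed-length) coding; neither is given in this text.

```python
def choose_envelope_coding(indices, huffman_lengths, natural_bits):
    """Return (huffman_flag, total_bits) for one layer's envelope indices.

    huffman_lengths -- hypothetical map from index value to code length in bits
    natural_bits    -- assumed fixed number of bits per index for natural coding
    """
    huffman_total = sum(huffman_lengths[i] for i in indices)
    natural_total = natural_bits * len(indices)
    if huffman_total < natural_total:
        return 1, huffman_total      # indicator set: Huffman coding is used
    return 0, natural_total          # indicator cleared: natural coding is used


lengths = {0: 1, 1: 2, 2: 3, 3: 3}
print(choose_envelope_coding([0, 0, 1, 0], lengths, natural_bits=2))
print(choose_envelope_coding([2, 3, 2, 3], lengths, natural_bits=2))
```

The same function would be called once for the core layer and once for the extended layer, producing one indicator bit for each.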
[0006]
6. METHOD, according to claim 1, characterized in that the quantization and encoding of the frequency domain coefficients of the core layer comprise:
performing Huffman coding on all quantization indices of the core layer obtained by pyramid lattice vector quantization;
if the total number of bits consumed after Huffman coding is performed on all quantization indices obtained by pyramid lattice vector quantization is less than the total number of bits consumed after natural coding is performed on all quantization indices obtained by pyramid lattice vector quantization, using Huffman coding, correcting the bit allocation numbers of the coding sub-bands using the number of bits saved by the Huffman coding, the number of bits remaining after a first bit allocation, and the total number of bits saved by coding all coding sub-bands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and performing vector quantization and Huffman coding again on the coding sub-bands whose bit allocation numbers are corrected; otherwise, using natural coding, correcting the bit allocation numbers of the coding sub-bands using the number of bits remaining after a first bit allocation and the total number of bits saved by coding all coding sub-bands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and performing vector quantization and natural coding again on the coding sub-bands whose bit allocation numbers are corrected;
and the quantization and encoding of the coding signals of the extended layer comprise:
performing Huffman coding on all quantization indices of the extended layer obtained by pyramid lattice vector quantization;
if the total number of bits consumed after Huffman coding is performed on all quantization indices obtained by pyramid lattice vector quantization is less than the total number of bits consumed after natural coding is performed on all quantization indices obtained by pyramid lattice vector quantization, using Huffman coding, correcting the bit allocation numbers of the coding sub-bands using the number of bits saved by the Huffman coding, the number of bits remaining after a first bit allocation, and the total number of bits saved by coding all coding sub-bands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and performing vector quantization and Huffman coding again on the coding sub-bands whose bit allocation numbers are corrected; otherwise, using natural coding, correcting the bit allocation numbers of the coding sub-bands using the number of bits remaining after a first bit allocation and the total number of bits saved by coding all coding sub-bands in which the number of bits allocated to a single frequency domain coefficient is 1 or 2, and performing vector quantization and natural coding again on the coding sub-bands whose bit allocation numbers are corrected.
[0007]
7. HIERARCHICAL AUDIO DECODING METHOD, characterized by comprising:
demultiplexing a bit stream transmitted by an encoding terminal, and decoding the encoded amplitude envelope bits of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, to obtain amplitude envelope quantization indices of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer; if the transient detection information indicates a transient signal, reorganizing the amplitude envelope quantization indices of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, respectively, so that their corresponding frequencies are aligned from low to high within the respective layers;
performing bit allocation on the coding sub-bands of the core layer according to the amplitude envelope quantization indices of the coding sub-bands of the core layer, thereby calculating the amplitude envelope quantization indices of the residual signals of the core layer, and performing bit allocation on the coding sub-bands of the coding signals of the extended layer according to the amplitude envelope quantization indices of the residual signals of the core layer and the amplitude envelope quantization indices of the coding sub-bands of the extended layer;
decoding the encoded bits of the frequency domain coefficients of the core layer and the encoded bits of the coding signals of the extended layer, respectively, according to the bit allocation numbers of the coding sub-bands of the core layer and of the coding sub-bands of the coding signals of the extended layer, to obtain the frequency domain coefficients of the core layer and the coding signals of the extended layer, and rearranging the coding signals of the extended layer in sub-band order and adding them to the frequency domain coefficients of the core layer, to obtain frequency domain coefficients of the total bandwidth; and
if the transient detection information indicates a steady-state signal, performing an inverse time-frequency transform directly on the frequency domain coefficients of the total bandwidth to obtain an audio signal for output; and if the transient detection information indicates a transient signal, reorganizing the frequency domain coefficients of the total bandwidth, dividing them into M groups of frequency domain coefficients, performing the inverse time-frequency transform on each group of frequency domain coefficients, and calculating a final audio signal from the M groups of time domain signals obtained by the transform.
[0008]
8. METHOD, according to claim 7, characterized in that, if the transient detection information indicates the transient signal, the reorganization of the frequency domain coefficients of the total bandwidth comprises: reorganizing the frequency domain coefficients so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies within the respective subframes, to obtain M groups of frequency domain coefficients, and then arranging the M groups of frequency domain coefficients in subframe order.
[0009]
9. HIERARCHICAL AUDIO CODING METHOD FOR TRANSIENT SIGNALS, characterized by comprising:
dividing an audio signal into M subframes, performing a time-frequency transform on each subframe, the M groups of frequency domain coefficients obtained by the transform constituting the total frequency domain coefficients of a current frame, and reorganizing the total frequency domain coefficients so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies, wherein the total frequency domain coefficients comprise frequency domain coefficients of the core layer and frequency domain coefficients of the extended layer, the coding sub-bands comprise coding sub-bands of the core layer and coding sub-bands of the extended layer, the frequency domain coefficients of the core layer constitute several coding sub-bands of the core layer, and the frequency domain coefficients of the extended layer constitute several coding sub-bands of the extended layer;
quantizing and encoding amplitude envelope values of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, to obtain amplitude envelope quantization indices and encoded amplitude envelope bits of the coding sub-bands of the core layer and of the extended layer; wherein the amplitude envelope values of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer are quantized separately, and the amplitude envelope quantization indices of the coding sub-bands of the core layer and those of the coding sub-bands of the extended layer are each reorganized;
performing bit allocation on the coding sub-bands of the core layer according to the amplitude envelope quantization indices of the coding sub-bands of the core layer, and then quantizing and encoding the frequency domain coefficients of the core layer to obtain encoded bits of the frequency domain coefficients of the core layer;
inversely quantizing the frequency domain coefficients of the core layer on which vector quantization has been performed, and computing the difference between the result and the original frequency domain coefficients obtained from the time-frequency transform, to obtain residual signals of the core layer;
calculating amplitude envelope quantization indices of the coding sub-bands of the residual signals of the core layer according to the amplitude envelope quantization indices of the coding sub-bands of the core layer and the bit allocation numbers of the coding sub-bands of the core layer;
performing bit allocation on the coding sub-bands of the coding signals of the extended layer according to the amplitude envelope quantization indices of the residual signals of the core layer and the amplitude envelope quantization indices of the coding sub-bands of the extended layer, and then quantizing and encoding the coding signals of the extended layer to obtain encoded bits of the coding signals of the extended layer, wherein the coding signals of the extended layer consist of the residual signals of the core layer and the frequency domain coefficients of the extended layer; and
multiplexing and packaging the encoded amplitude envelope bits of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, the encoded bits of the frequency domain coefficients of the core layer and the encoded bits of the coding signals of the extended layer, and then transmitting them to a decoding terminal.
[0010]
10. METHOD, according to claim 9, characterized in that the frequency domain coefficients are reorganized so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies within the core layer and within the extended layer respectively.
[0011]
11. METHOD, according to claim 10, characterized in that, when the reorganization is carried out within the core layer and within the extended layer respectively, if the frequency domain coefficients remaining in a group are not sufficient to constitute a sub-band, they are supplemented with frequency domain coefficients having the same or similar frequencies from the next group of frequency domain coefficients.
[0012]
12. METHOD, according to claim 9 or 10, characterized in that the indices of the frequency domain coefficients in the coding sub-bands are as follows after the reorganization:
[0013]
13. HIERARCHICAL DECODING METHOD FOR TRANSIENT SIGNALS, characterized by comprising:
demultiplexing a bit stream transmitted by an encoding terminal, decoding the encoded amplitude envelope bits of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, to obtain amplitude envelope quantization indices of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, and reorganizing the amplitude envelope quantization indices of the coding sub-bands of the core layer and of the coding sub-bands of the extended layer, respectively, so that their corresponding frequencies are aligned from low to high within the respective layers;
performing bit allocation on the coding sub-bands of the core layer according to the reorganized amplitude envelope quantization indices of the coding sub-bands of the core layer, and thereby calculating the amplitude envelope quantization indices of the residual signals of the core layer;
performing bit allocation on the coding sub-bands of the coding signals of the extended layer according to the amplitude envelope quantization indices of the residual signals of the core layer and the reorganized amplitude envelope quantization indices of the coding sub-bands of the extended layer;
decoding the encoded bits of the frequency domain coefficients of the core layer and the encoded bits of the coding signals of the extended layer, respectively, according to the bit allocation numbers of the coding sub-bands of the core layer and of the coding sub-bands of the coding signals of the extended layer, to obtain the frequency domain coefficients of the core layer and the coding signals of the extended layer, and rearranging the coding signals of the extended layer in sub-band order and adding them to the frequency domain coefficients of the core layer, to obtain frequency domain coefficients of the total bandwidth; and
reorganizing the frequency domain coefficients of the total bandwidth and dividing them into M groups, performing an inverse time-frequency transform on each group of frequency domain coefficients, and calculating a final audio signal from the M groups of time domain signals obtained by the transform.
[0014]
14. METHOD, according to claim 13, characterized in that the step of reorganizing the frequency domain coefficients of the total bandwidth comprises: reorganizing the frequency domain coefficients so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies within the respective subframes, to obtain M groups of frequency domain coefficients, and then arranging the M groups of frequency domain coefficients in subframe order.
[0015]
15. HIERARCHICAL AUDIO CODING SYSTEM, characterized by comprising: a unit of generation of frequency domain coefficients, a unit of calculation of amplitude envelopes, a unit of encoding and quantification of amplitude envelopes, a unit of bit allocation the core layer, a unit for encoding and quantifying vectors of frequency domain coefficients of the core layer, a multiplexer of bit streams; and further comprises: a transient detection unit, a unit of generation of encoding signals of the extended layer, a unit of generation of surrounds of amplitude of residual signals, a unit of bit allocation of the expanded layer, and a unit of encoding and vector quantification of encoding signals from the extended layer; wherein the transient detection unit is configured to perform a transient detection on an audio signal of a current frame; the frequency domain coefficient generating unit is connected to the transient detection unit, and is configured for: when transient detection has to be a steady state signal, a time frequency transformation into an audio signal is performed to obtain total frequency domain coefficients; when the transient detection is to be a transient signal, the audio signal is divided into M subframes, the time frequency transformation is performed in each subframe, the total frequency domain coefficients of the current frame are constituted by the M group of coefficients frequency domain obtained by the transformation, the total frequency domain coefficients are reorganized so that their corresponding coding sub-bands are aligned from low frequencies to high frequencies, in which the total frequency domain coefficients comprise coefficients frequency domain of the core layer and frequency domain coefficients of the extended layer, the coding subbands comprise coding subbands of the core layer and coding subbands of the extended layer, the frequency domain coefficients of the core layer constitute several coding sub-bands of the core layer, and the 
domain coefficients frequency of the extended layer constitute several coding sub-bands of the extended layer;
the amplitude envelope calculation unit is connected to the frequency domain coefficient generation unit and is configured to calculate amplitude envelope values of the core layer coding sub-bands and the extended layer coding sub-bands;
the amplitude envelope quantization and encoding unit is connected to the amplitude envelope calculation unit and the transient detection unit and is configured to quantize and encode the amplitude envelope values of the core layer coding sub-bands and the extended layer coding sub-bands, to obtain the amplitude envelope quantization indices and the encoded amplitude envelope bits of the core layer coding sub-bands and the extended layer coding sub-bands; wherein, if the signal is a steady-state signal, the amplitude envelope values of the core layer coding sub-bands and the extended layer coding sub-bands are quantized jointly, and if the signal is a transient signal, the amplitude envelope values of the core layer coding sub-bands and of the extended layer coding sub-bands are quantized separately, and the amplitude envelope quantization indices of the core layer coding sub-bands and those of the extended layer coding sub-bands are each reorganized;
the core layer bit allocation unit is connected to the amplitude envelope quantization and encoding unit and is configured to perform bit allocation over the core layer coding sub-bands according to the amplitude envelope quantization indices of the core layer coding sub-bands, to obtain the bit allocation numbers of the core layer coding sub-bands;
the core layer frequency domain coefficient vector quantization and encoding unit is connected to the frequency domain coefficient generation unit, the amplitude envelope quantization and encoding unit and the core layer bit allocation unit, and is configured to: perform normalization, vector quantization and encoding on the frequency domain coefficients of the core layer coding sub-bands, using the bit allocation numbers of the core layer coding sub-bands and the quantized amplitude envelope values of the core layer coding sub-bands reconstructed from the amplitude envelope quantization indices of the core layer coding sub-bands, to obtain the encoded bits of the frequency domain coefficients of the core layer;
the extended layer encoding signal generation unit is connected to the frequency domain coefficient generation unit and to the core layer frequency domain coefficient vector quantization and encoding unit, and is configured to generate the residual signals of the core layer, to obtain the encoding signals of the extended layer, which comprise the residual signals of the core layer and the frequency domain coefficients of the extended layer;
the residual signal amplitude envelope generation unit is connected to the amplitude envelope quantization and encoding unit and the core layer bit allocation unit, and is configured to obtain the amplitude envelope quantization indices of the residual signals of the core layer according to the amplitude envelope quantization indices of the core layer coding sub-bands and the bit allocation numbers of the corresponding core layer coding sub-bands;
the extended layer bit allocation unit is connected to the residual signal amplitude envelope generation unit and the amplitude envelope quantization and encoding unit, and is configured to perform bit allocation over the coding sub-bands of the encoding signals of the extended layer, according to the amplitude envelope quantization indices of the residual signals of the core layer and the amplitude envelope quantization indices of the extended layer coding sub-bands, to obtain the bit allocation numbers of the coding sub-bands of the encoding signals of the extended layer;
the extended layer encoding signal vector quantization and encoding unit is connected to the amplitude envelope quantization and encoding unit, the extended layer bit allocation unit, the residual signal amplitude envelope generation unit and the extended layer encoding signal generation unit, and is configured to: perform normalization, vector quantization and encoding on the encoding signals of the extended layer, using the bit allocation numbers of the coding sub-bands of the encoding signals of the extended layer and the quantized amplitude envelope values of those coding sub-bands reconstructed from their amplitude envelope quantization indices, to obtain the encoded bits of the encoding signals of the extended layer;
the bitstream multiplexer is connected to the amplitude envelope quantization and encoding unit, the core layer frequency domain coefficient vector quantization and encoding unit and the extended layer encoding signal vector quantization and encoding unit, and is configured to package the side information bits of the core layer, the encoded amplitude envelope bits of the core layer coding sub-bands, the encoded bits of the frequency domain coefficients of the core layer, the side information bits of the extended layer, the encoded amplitude envelope bits of the extended layer coding sub-bands and the encoded bits of the encoding signals of the extended layer.
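The relationship between the core layer and the extended layer encoding signals described in claim 15 can be illustrated with a minimal Python sketch. It is not the patented implementation: the log2-domain envelope rule, the clipped uniform scalar quantizer standing in for the vector quantizer, and all function names (`envelope_index`, `quantize_subband`, `extended_layer_signal`) are assumptions made for illustration. It only shows that the residual of each core layer sub-band is the original coefficients minus their dequantized values, and that the extended layer encoding signal consists of those residuals followed by the extended layer coefficients.

```python
import math

def envelope_index(subband):
    """Quantize the RMS envelope of a sub-band in the log2 domain (assumed rule)."""
    rms = math.sqrt(sum(c * c for c in subband) / len(subband))
    return round(2 * math.log2(max(rms, 1e-9)))

def envelope_value(index):
    """Reconstruct the quantized envelope value from its index."""
    return 2.0 ** (index / 2)

def quantize_subband(subband, bits):
    """Normalize, quantize and reconstruct one sub-band.

    A clipped uniform scalar quantizer stands in for the vector quantizer
    (an assumption). With zero bits nothing is coded, so the
    reconstruction is all zeros.
    """
    if bits == 0:
        return [0.0] * len(subband)
    env = envelope_value(envelope_index(subband))
    step = 4.0 / (2 ** bits)        # normalized range [-2, 2) split into 2**bits cells
    half = 2 ** (bits - 1)
    levels = [min(half - 1, max(-half, round(c / env / step))) for c in subband]
    return [level * step * env for level in levels]

def extended_layer_signal(core, extended, bits):
    """Return (core layer residual sub-bands, extended layer encoding signal)."""
    residual = []
    for subband, b in zip(core, bits):
        recon = quantize_subband(subband, b)
        residual.append([c - r for c, r in zip(subband, recon)])
    # Encoding signal: core layer residuals followed by extended layer coefficients.
    return residual, residual + extended

core = [[4.0, -4.0, 4.0, -4.0], [1.0, 0.5, -0.5, -1.0]]
extended = [[0.25, -0.25]]
residual, signal = extended_layer_signal(core, extended, bits=[3, 0])
```

In this toy input the first sub-band is reproduced exactly by the quantizer, so its residual is zero, while the second sub-band receives no bits and its residual equals the original coefficients.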
[0016]
16. SYSTEM, according to claim 15, characterized in that the frequency domain coefficient generation unit is further configured to: when reorganizing the frequency domain coefficients, reorganize them separately within the core layer and within the extended layer, so that their corresponding coding sub-bands are arranged from low frequencies to high frequencies.
[0017]
17. SYSTEM, according to claim 16, characterized in that, during the respective reorganization within the core layer and within the extended layer, if the frequency domain coefficients remaining in a group are not sufficient to constitute a sub-band, they are supplemented with frequency domain coefficients having the same or similar frequencies from the next group of frequency domain coefficients.
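One possible reading of the reorganization in claims 16 and 17 is sketched below in Python. The exact index layout is given in claim 18 and is not reproduced in this text, so everything here is an assumption: the function name `reorganize`, the fixed sub-band `width`, and the ordering rule (one sub-band per group for each full frequency block, then leftovers pooled across groups). The sketch only illustrates sub-bands ordered from low to high frequency, with short leftovers completed from the next group at the same frequencies.

```python
def reorganize(groups, width):
    """Arrange per-group coefficients into sub-bands ordered low to high.

    groups: list of equally long coefficient lists, one per group
            (hypothetical layout, lowest frequency first).
    width:  number of coefficients per coding sub-band (assumed fixed).
    """
    n = len(groups[0])
    full_end = n - n % width          # end of the last complete frequency block
    subbands = []
    # For each complete frequency block, emit one sub-band per group.
    for start in range(0, full_end, width):
        for group in groups:
            subbands.append(group[start:start + width])
    # Leftovers are too short to form a sub-band on their own, so they are
    # completed with same-frequency coefficients from the following groups.
    leftovers = [c for group in groups for c in group[full_end:]]
    for start in range(0, len(leftovers), width):
        subbands.append(leftovers[start:start + width])
    return subbands

example = reorganize([[0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15]], width=4)
```

With two groups of six coefficients and a sub-band width of four, each group contributes one full sub-band, and the two leftover coefficients of each group are merged into a third sub-band.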
[0018]
18. SYSTEM, according to claim 15 or 16, characterized in that the indices of the frequency domain coefficients in the coding sub-bands are as follows after the reorganization:

Similar technologies:

Publication number | Publication date | Patent title

BR112012021359B1|2020-12-15|HIERARCHICAL AUDIO CODING METHOD, HIERARCHICAL AUDIO DECODING METHOD, HIERARCHICAL AUDIO CODING METHOD FOR TRANSIENT SIGNALS, HIERARCHICAL AUDIO DECODING METHOD FOR TRANSIENT SIGNALS, AND HIERARCHICAL AUDIO CODING SYSTEM

RU2509380C2|2014-03-10|Method and apparatus for hierarchical encoding and decoding audio

US10546592B2|2020-01-28|Audio signal coding and decoding method and device

BR112012009714B1|2021-01-19|lattice vector quantization audio coding method, lattice vector quantization audio decoding method, lattice vector quantization audio coding system and lattice vector quantization audio decoding system

TWI590233B|2017-07-01|Decoder and decoding method thereof, encoder and encoding method thereof, computer program

TW200935403A|2009-08-16|Technique for encoding/decoding of codebook indices for quantized MDCT spectrum in scalable speech and audio codecs

JP2018205766A|2018-12-27|Method, encoder, decoder, and mobile equipment

TW201717194A|2017-05-16|Apparatus and methods to perform huffman coding

KR20100113172A|2010-10-20|Reduced-complexity vector indexing and de-indexing

KR101736705B1|2017-05-16|Bit allocation method and device for audio signal

WO2013051210A1|2013-04-11|Encoding device and encoding method

WO2014063489A1|2014-05-01|Bit allocation method and device for audio signal

BR112016021165B1|2020-11-10|audio decoding devices and methods and recording media

TW201209805A|2012-03-01|Device and method for efficiently encoding quantization parameters of spectral coefficient coding

CN105723454A|2016-06-29|Energy lossless coding method and device, signal coding method and device, energy lossless decoding method and device, and signal decoding method and device

US8924208B2|2014-12-30|Encoding device and encoding method

US10102864B2|2018-10-16|Method and apparatus for coding or decoding subband configuration data for subband groups

BR112012012573B1|2021-10-19|METHOD AND SYSTEM OF ENCODING, HIERARCHICAL AUDIO DECODING

JP5544371B2|2014-07-09|Encoding device, decoding device and methods thereof

BR112015025009B1|2021-12-21|QUANTIZATION AND REVERSE QUANTIZATION UNITS, ENCODER AND DECODER, METHODS FOR QUANTIZING AND DEQUANTIZING

BRPI0317954B1|2017-01-03|Variable rate audio coding and decoding process

Patent family:

Publication number | Publication date

CN102222505A|2011-10-19|

US20120323582A1|2012-12-20|

WO2011127757A1|2011-10-20|

EP2528057B1|2016-04-06|

US8874450B2|2014-10-28|

EP2528057A4|2014-08-06|

HK1179402A1|2013-09-27|

CN102222505B|2012-12-19|

EP2528057A1|2012-11-28|

RU2522020C1|2014-07-10|

RU2012136397A|2014-05-20|

Cited documents:

Publication number | Filing date | Publication date | Applicant | Patent title

US5502789A|1990-03-07|1996-03-26|Sony Corporation|Apparatus for encoding digital data with reduction of perceptible noise|

CN1062963C|1990-04-12|2001-03-07|多尔拜实验特许公司|Adaptive-block-length, adaptive-transform, and adaptive-window transform coder, decoder, and encoder/decoder for high-quality audio|

US5388181A|1990-05-29|1995-02-07|Anderson; David J.|Digital audio compression system|

US5956674A|1995-12-01|1999-09-21|Digital Theater Systems, Inc.|Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels|

US5886276A|1997-01-16|1999-03-23|The Board Of Trustees Of The Leland Stanford Junior University|System and method for multiresolution scalable audio signal encoding|

KR100335609B1|1997-11-20|2002-10-04|삼성전자 주식회사|Scalable audio encoding/decoding method and apparatus|

DE60017825T2|1999-03-23|2006-01-12|Nippon Telegraph And Telephone Corp.|Method and device for coding and decoding audio signals and record carriers with programs therefor|

WO2000060576A1|1999-04-05|2000-10-12|Hughes Electronics Corporation|Spectral phase modeling of the prototype waveform components for a frequency domain interpolative speech codec system|

US6260017B1|1999-05-07|2001-07-10|Qualcomm Inc.|Multipulse interpolative coding of transition speech frames|

US6931373B1|2001-02-13|2005-08-16|Hughes Electronics Corporation|Prototype waveform phase modeling for a frequency domain interpolative speech codec system|

DE60225130T2|2001-05-10|2009-02-26|Dolby Laboratories Licensing Corp., San Francisco|IMPROVED TRANSIENT PERFORMANCE FOR LOW-BITRATE CODERS THROUGH SUPPRESSION OF THE PREVIOUS NOISE|

US7003454B2|2001-05-16|2006-02-21|Nokia Corporation|Method and system for line spectral frequency vector quantization in speech codec|

US7328150B2|2002-09-04|2008-02-05|Microsoft Corporation|Innovations in pure lossless audio compression|

JP2007505346A|2003-09-09|2007-03-08|コニンクリユケフィリップスエレクトロニクスエヌ．ブイ．|Coding of audio signal component of transition|

FI119533B|2004-04-15|2008-12-15|Nokia Corp|Coding of audio signals|

US7895034B2|2004-09-17|2011-02-22|Digital Rise Technology Co., Ltd.|Audio encoding system|

US7386445B2|2005-01-18|2008-06-10|Nokia Corporation|Compensation of transient effects in transform coding|

US7961890B2|2005-04-15|2011-06-14|Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung, E.V.|Multi-channel hierarchical audio coding with compact side information|

EP1959433B1|2005-11-30|2011-10-19|Panasonic Corporation|Subband coding apparatus and method of coding subband|

US8417532B2|2006-10-18|2013-04-09|Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.|Encoding an information signal|

CN101206860A|2006-12-20|2008-06-25|华为技术有限公司|Method and apparatus for encoding and decoding layered audio|

JP5255575B2|2007-03-02|2013-08-07|テレフオンアクチーボラゲットエルエムエリクソン（パブル）|Post filter for layered codec|

US8290782B2|2008-07-24|2012-10-16|Dts, Inc.|Compression of audio scale-factors by two-dimensional transformation|

CN101414864B|2008-12-08|2013-01-30|华为技术有限公司|Method and apparatus for multi-antenna layered pre-encoding|

CA2832032C|2011-04-20|2019-09-24|Panasonic Corporation|Device and method for execution of huffman coding|

KR102053900B1|2011-05-13|2019-12-09|삼성전자주식회사|Noise filling Method, audio decoding method and apparatus, recoding medium and multimedia device employing the same|

JP5807453B2|2011-08-30|2015-11-10|富士通株式会社|Encoding method, encoding apparatus, and encoding program|

EP2717265A1|2012-10-05|2014-04-09|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Encoder, decoder and methods for backward compatible dynamic adaption of time/frequency resolution in spatial-audio-object-coding|

CN103854653B|2012-12-06|2016-12-28|华为技术有限公司|The method and apparatus of signal decoding|

DK3561808T3|2013-02-05|2021-05-03|Ericsson Telefon Ab L M|Method and device for controlling masking of audio frame loss|

US9665541B2|2013-04-25|2017-05-30|Mozilla Corporation|Encoding video data using reversible integer approximations of orthonormal transforms|

US9560386B2|2013-02-21|2017-01-31|Mozilla Corporation|Pyramid vector quantization for video coding|

WO2015081699A1|2013-12-02|2015-06-11|华为技术有限公司|Encoding method and apparatus|

CN105659321B|2014-02-28|2020-07-28|弗朗霍弗应用研究促进协会|Decoding device and decoding method|

EP3128514A4|2014-03-24|2017-11-01|Samsung Electronics Co., Ltd.|High-band encoding method and device, and high-band decoding method and device|

RU2665898C2|2014-07-28|2018-09-04|Телефонактиеболагет Лм Эрикссон |Pyramidal vector quantizer shape searching|

FR3024581A1|2014-07-29|2016-02-05|Orange|DETERMINING A CODING BUDGET OF A TRANSITION FRAME LPD / FD|

EP2988300A1|2014-08-18|2016-02-24|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Switching of sampling rates at audio processing devices|

EP2993665A1|2014-09-02|2016-03-09|Thomson Licensing|Method and apparatus for coding or decoding subband configuration data for subband groups|

RU2698779C2|2014-09-04|2019-08-29|Сони Корпорейшн|Transmission device, transmission method, receiving device and reception method|

RU2701060C2|2014-09-30|2019-09-24|Сони Корпорейшн|Transmitting device, transmission method, receiving device and reception method|

KR102362788B1|2015-01-08|2022-02-15|한국전자통신연구원|Apparatus for generating broadcasting signal frame using layered division multiplexing and method using the same|

EP3182411A1|2015-12-14|2017-06-21|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Apparatus and method for processing an encoded audio signal|

US10210871B2|2016-03-18|2019-02-19|Qualcomm Incorporated|Audio processing for temporally mismatched signals|

US10586546B2|2018-04-26|2020-03-10|Qualcomm Incorporated|Inversely enumerated pyramid vector quantizers for efficient rate adaptation in audio coding|

US10573331B2|2018-05-01|2020-02-25|Qualcomm Incorporated|Cooperative pyramid vector quantizers for scalable audio coding|

US10734006B2|2018-06-01|2020-08-04|Qualcomm Incorporated|Audio coding based on audio pattern recognition|

Legal status:
2018-03-27| B15K| Others concerning applications: alteration of classification|Ipc: G10L 19/00 (2013.01), G10L 19/02 (2013.01) |

2019-01-08| B06F| Objections, documents and/or translations needed after an examination request according art. 34 industrial property law|

2020-01-28| B06U| Preliminary requirement: requests with searches performed by other patent offices: suspension of the patent application procedure|

2020-06-23| B06A| Notification to applicant to reply to the report for non-patentability or inadequacy of the application according art. 36 industrial patent law|

2020-09-29| B09A| Decision: intention to grant|

2020-12-15| B16A| Patent or certificate of addition of invention granted|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 12/01/2011, SUBJECT TO THE LEGAL CONDITIONS. |

Priority:

Application number | Filing date | Patent title

CN201010145531.1|2010-04-13|

CN2010101455311A|CN102222505B|2010-04-13|2010-04-13|Hierarchical audio coding and decoding methods and systems and transient signal hierarchical coding and decoding methods|

PCT/CN2011/070206|WO2011127757A1|2010-04-13|2011-01-12|Hierarchical audio frequency encoding and decoding method and system, hierarchical frequency encoding and decoding method for transient signal|
