Patent Abstract:
METHOD AND APPARATUS FOR SWITCHING SPEECH OR AUDIO SIGNALS A method and apparatus for switching speech or audio signals is disclosed. The method for switching speech or audio signals includes: when switching of a speech or audio signal occurs, weighting a first high frequency band signal of a current speech or audio signal frame and a second high frequency band signal of the previous M frames of speech or audio signals to obtain a first processed high frequency band signal (101); and synthesizing the first processed high frequency band signal and a first low frequency band signal of the current speech or audio signal frame into a wide frequency band signal (102).
Publication number: BR112012013306B1
Application number: R112012013306-3
Filing date: 2011-04-28
Publication date: 2020-11-10
Inventors: Zexin LIU; Lei Miao; Chen Hu; Wenhai WU; Yue Lang; Qinz Zhang
Applicant: Huawei Technologies Co., Ltd.
IPC main classification:
Patent Description:

This application claims priority to Chinese Patent Application No. 201020163406.3, entitled "METHOD AND APPARATUS FOR SWITCHING SPEECH OR AUDIO SIGNALS", filed on August 28, 2010, which is incorporated herein by reference in its entirety. FIELD OF THE INVENTION
The present invention relates to communication technologies and, in particular, to a method and apparatus for switching speech or audio signals. BACKGROUND OF THE INVENTION
Currently, during the process of transmitting speech or audio signals over a network, as network conditions may vary, the network may truncate the bitstream of speech or audio signals transmitted from an encoder at different bit rates, so that the decoder decodes speech or audio signals with different bandwidths from the truncated bitstream.
In the prior art, since speech or audio signals transmitted over the network have different bandwidths, bidirectional switching between a narrow frequency band speech or audio signal and a wide frequency band speech or audio signal may occur during the process of transmitting speech or audio signals. In versions of the present invention, the narrow frequency band signal is converted into a wide frequency band signal that contains only a low frequency band component through up-sampling and low-pass filtering, while the wide frequency band speech or audio signal includes both a low frequency band signal component and a high frequency band signal component.
During the implementation of the present invention, the inventor discovered at least the following problems in the prior art. Because high frequency band signal information is available in wide frequency band speech or audio signals but absent from narrow frequency band speech or audio signals, when speech or audio signals with different bandwidths are switched, a power jump may occur in the speech or audio signals, causing listening discomfort and thus reducing the quality of the audio signals received by the user. SUMMARY OF THE INVENTION
Versions of the present invention provide a method and apparatus for switching speech or audio signals to smoothly switch speech or audio signals between different bandwidths, thereby improving the quality of the audio signals received by the user.
An apparatus for switching speech or audio signals includes:
a processing module, configured to: when switching of a speech or audio signal occurs, weight a first high frequency band signal of a current frame of speech or audio signals and a second high frequency band signal of the previous M frames of speech or audio signals to obtain a first processed high frequency band signal, where M is equal to 1; and
a first synthesizer module, configured to synthesize the first processed high frequency band signal and a first low frequency band signal of the current frame of the speech or audio signal into a wide frequency band signal.
A method for switching speech or audio signals includes: when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, weighting a first high frequency band signal of a current speech or audio signal frame and a second high frequency band signal of the previous M frames of speech or audio signals to obtain a first processed high frequency band signal, where M is equal to 1; and synthesizing the first processed high frequency band signal and a first low frequency band signal of the current speech or audio signal frame into a wide frequency band signal.
An apparatus for switching speech or audio signals includes: a processing module, adapted to: when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, weight a first high frequency band signal of a current speech or audio signal frame and a second high frequency band signal of the previous M frames of speech or audio signals to obtain a first processed high frequency band signal, where M is equal to 1; and a first synthesizer module, adapted to synthesize the first processed high frequency band signal and a first low frequency band signal of the current speech or audio signal frame into a wide frequency band signal.
When using the method and apparatus for switching speech or audio signals in versions of the present invention, the first high frequency band signal of the current speech or audio signal frame is processed according to the second high frequency band signal of the previous M frames of speech or audio signals, so that the second high frequency band signal of the previous M frames of speech or audio signals can be switched smoothly to the first processed high frequency band signal; the first processed high frequency band signal and the first low frequency band signal are synthesized into a wide frequency band signal. In this way, during the process of switching between speech or audio signals with different bandwidths, these speech or audio signals can be switched smoothly, thereby reducing the adverse impact of the power jump on the subjective audio quality of the speech or audio signals and improving the quality of the speech or audio signals received by the user. BRIEF DESCRIPTION OF THE DRAWINGS
To clarify the technical solution of the present invention, the accompanying drawings used to illustrate the versions of the present invention are outlined below. Apparently, the accompanying drawings are only exemplary, and persons skilled in the art can derive other drawings from these accompanying drawings without creative efforts.
Figure 1 is a flow chart of a first version of a method for switching speech or audio signals.
Figure 2 is a flow chart of a second version of the method for switching speech or audio signals.
Figure 3 is a flow chart of a version of step 201 shown in Figure 2.
Figure 4 is a flow chart of a version of step 302 shown in Figure 3.
Figure 5 is a second flow chart of another version of step 302 shown in Figure 3.
Figure 6 is a flow chart of a version of step 202 shown in Figure 2.
Figure 7 is a second flow chart of another version of step 201 shown in Figure 2.
Figure 8 is a third flow chart of another version of step 201 shown in Figure 2.
Figure 9 shows the structure of a first version of an apparatus for switching speech or audio signals.
Figure 10 shows the structure of a second version of the device for switching speech or audio signals.
Figure 11 is a first schematic diagram that illustrates the structure of a processing module in the second version of the apparatus for switching speech or audio signals.
Figure 12 is a schematic diagram illustrating the structure of a first module in the second version of the device for switching speech or audio signals.
Figure 13a is a second schematic diagram illustrating the structure of a processing module in the second version of the apparatus for switching speech or audio signals.
Figure 13b is a third schematic diagram that illustrates the structure of a processing module in the second version of the apparatus for switching speech or audio signals. DETAILED DESCRIPTION OF THE VERSIONS
To facilitate the understanding of the object, technical solution, and merit of the present invention, the following describes the present invention in detail with reference to the accompanying versions and drawings. Apparently, the versions are only exemplary and the present invention is not limited to those versions. Persons of ordinary skill in the art can derive other versions from the versions given here without creative efforts, and all of these versions are covered within the scope of the present invention.
Figure 1 is a flow chart of the first version of a method for switching speech or audio signals. As shown in Figure 1, when using the method for switching speech or audio signals, when a speech or audio signal is switched, each frame after the switching frame is processed according to the following steps:
Step 101: When switching of a speech or audio signal occurs, weight the first high frequency band signal of the current speech or audio signal frame and the second high frequency band signal of the previous M frames of speech or audio signals to obtain a first processed high frequency band signal, where M is greater than or equal to 1.
Step 102: Synthesize the first processed high frequency band signal and the first low frequency band signal of the current frame of the speech or audio signal into a wide frequency band signal.
In this version, the previous M frames of speech or audio signals refer to the M frames of speech or audio signals before the current frame. The previous L frames of speech or audio signals before switching refer to the L frames of speech or audio signals before the switching frame. When switching of a speech or audio signal occurs, if the current speech frame is a wide frequency band signal, the speech or audio signal is switched and the current speech frame is the switching frame.
When using the method for switching speech or audio signals in this version, the first high frequency band signal of the current speech or audio signal frame is processed according to the second high frequency band signal of the previous M frames of speech or audio signals, so that the second high frequency band signal of the previous M frames of speech or audio signals can be smoothly switched to the first processed high frequency band signal. In this way, during the process of switching between speech or audio signals with different bandwidths, the high frequency band signal of these speech or audio signals can be switched smoothly. Finally, the first processed high frequency band signal and the first low frequency band signal are synthesized into a wide frequency band signal, and the wide frequency band signal is transmitted to the user's terminal, so that the user enjoys a high quality speech or audio signal. By using the method for switching speech or audio signals in this version, speech or audio signals with different bandwidths can be switched smoothly, thereby reducing the impact of the energy jump on the subjective audio quality of the speech or audio signals and improving the quality of the speech or audio signals received by the user.
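A minimal sketch of steps 101 and 102 in Python, assuming the decoder already has the current frame split into low and high frequency band components; the function names, the single averaged-weight scheme, and the concatenation-based synthesis are illustrative placeholders, not details taken from the patent:

```python
import numpy as np

def switch_frame(cur_high, prev_highs, cur_low, w_prev=0.5):
    """Steps 101-102 (illustrative): weight the current frame's first high band
    signal with the high band signals of the previous M frames, then synthesize
    a wide frequency band frame from the low band and the processed high band."""
    prev_avg = np.mean(prev_highs, axis=0)                           # combine the previous M frames
    processed_high = w_prev * prev_avg + (1.0 - w_prev) * cur_high   # step 101: weighting
    return synthesize(cur_low, processed_high)                       # step 102: synthesis

def synthesize(low_band, high_band):
    """Placeholder for the synthesis stage: a real codec would use a QMF or
    similar synthesis filter bank to combine the two sub-bands."""
    return np.concatenate([low_band, high_band])
```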
Figure 2 is a flow chart of the second version of the method for switching speech or audio signals. As shown in Figure 2, the method includes the following steps:
Step 200: When switching of the speech or audio signal does not occur, synthesize the first high frequency band signal of the current speech or audio signal frame and the first low frequency band signal into a wide frequency band signal.
Specifically, the speech or audio signal of the current frame in this version may be a wide frequency band speech or audio signal or a narrow frequency band speech or audio signal. When the speech or audio signal is not switched during the transmission of the speech or audio signal, the operation can be performed according to the following two cases: 1. If the speech or audio signal of the current frame is a wide frequency band speech or audio signal, the low frequency band signal and the high frequency band signal of the wide frequency band speech or audio signal are synthesized into a wide frequency band signal. 2. If the speech or audio signal of the current frame is a narrow frequency band speech or audio signal, the low frequency band signal and the high frequency band signal of the narrow frequency band speech or audio signal are synthesized into a wide frequency band signal. In this case, although the signal is a wide frequency band signal, the high frequency band signal is null.
Step 201: When the speech or audio signal is switched, weight the first high frequency band signal of the current speech or audio signal frame and the second high frequency band signal of the previous M frames of speech or audio signals to obtain a first processed high frequency band signal. M is greater than or equal to 1.
Specifically, when switching between speech or audio signals with different bandwidths occurs, the first high frequency band signal of the current speech or audio signal frame is processed according to the second high frequency band signal of the previous M frames of speech or audio signals, so that the second high frequency band signal of the previous M frames of speech or audio signals can be smoothly switched to the first processed high frequency band signal. For example, when the wide frequency band speech or audio signal is switched to the narrow frequency band speech or audio signal, since the high frequency band signal information corresponding to the narrow frequency band speech or audio signal is null, the high frequency band signal component corresponding to the narrow frequency band speech or audio signal needs to be restored to allow the wide frequency band speech or audio signal to be smoothly switched to the narrow frequency band speech or audio signal. However, when the narrow frequency band speech or audio signal is switched to the wide frequency band speech or audio signal, since the high frequency band signal of the wide frequency band speech or audio signal is not null, the energy of the high frequency band signals of the consecutive multiple frames of wide frequency band speech or audio signals after switching needs to be attenuated to allow the narrow frequency band speech or audio signal to be switched smoothly to the wide frequency band speech or audio signal, so that the high frequency band signal of the wide frequency band speech or audio signal is gradually switched to a real high frequency band signal. When processing the speech or audio signal of the current frame in step 201, the high frequency band signals in the speech or audio signals with different bandwidths can be smoothly switched, which avoids uncomfortable listening for the user due to a sudden change of energy in the switching process between the wide frequency band speech or audio signal and the narrow frequency band speech or audio signal, allowing the user to receive high quality audio signals. To simplify the process of obtaining the first processed high frequency band signal, the first high frequency band signal and the second high frequency band signal of the previous M frames of speech or audio signals can be weighted directly. The weighted result is the first processed high frequency band signal.
Step 202: Synthesize the first processed high frequency band signal and the first low frequency band signal of the current speech or audio signal frame into a wide frequency band signal.
Specifically, after the speech or audio signal of the current frame is processed in step 201, the second high frequency band signal of the previous M frames of speech or audio signals can be smoothly switched to the first processed high frequency band signal of the current frame; then, in step 202, the first processed high frequency band signal and the first low frequency band signal of the current speech or audio signal frame are synthesized into a wide frequency band signal, so that the speech or audio signals received by the user are always wide frequency band speech or audio signals. In this way, speech or audio signals with different bandwidths are switched smoothly, which helps to improve the quality of the audio signals received by the user.
When using the method for switching speech or audio signals in this version, the first high frequency band signal of the current speech or audio signal frame is processed according to the second high frequency band signal of the previous M frames of speech or audio signals, so that the second high frequency band signal of the previous M frames of speech or audio signals can be switched smoothly to the first processed high frequency band signal. In this way, during the process of switching between speech or audio signals with different bandwidths, the high frequency band signal of those speech or audio signals can be switched smoothly. Finally, the first processed high frequency band signal and the first low frequency band signal are synthesized into a wide frequency band signal, and the wide frequency band signal is transmitted to the user's terminal, so that the user enjoys a high quality speech or audio signal. By using the method for switching speech or audio signals in this version, speech or audio signals with different bandwidths can be switched smoothly, thereby reducing the impact of the energy jump on the subjective audio quality of the speech or audio signals and improving the quality of the audio signals received by the user. In addition, when speech or audio signals with different bandwidths are not switched, the first high frequency band signal and the first low frequency band signal of the current speech or audio signal frame are synthesized into a wide frequency band signal, so that the user can get a high quality audio signal.
According to the previous technical solution, optionally, as shown in Figure 3, when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, step 201 includes the following steps:
Step 301: Predict the fine structure information and the envelope information corresponding to the first high frequency band signal.
Specifically, the speech or audio signal may be divided into fine structure information and envelope information, so that the speech or audio signal can be restored according to the fine structure information and the envelope information. In the process of switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal, as only the low frequency band signal is available in the narrow frequency band speech or audio signal and the high frequency band signal is null, to allow the wide frequency band speech or audio signal to be switched smoothly to the narrow frequency band speech or audio signal, the high frequency band signal required by the current narrow frequency band speech or audio signal needs to be restored in order to implement smooth switching between the speech or audio signals. In step 301, the fine structure information and the envelope information corresponding to the first high frequency band signal of the narrow frequency band speech or audio signal are predicted.
To more accurately predict the fine structure information and the envelope information corresponding to the current speech or audio signal frame, the first low frequency band signal of the current speech or audio signal frame can be classified in step 301, and then the fine structure information and the envelope information corresponding to the first high frequency band signal are predicted according to the signal type of the first low frequency band signal. For example, the narrow frequency band speech or audio signal of the current frame may be a harmonic signal, a non-harmonic signal, or a transient signal. In this case, the fine structure information and the envelope information corresponding to the type of the narrow frequency band speech or audio signal can be obtained, so that the fine structure information and the envelope information corresponding to the high frequency band signal can be predicted with greater precision. The method for switching speech or audio signals in this version does not limit the signal type of the narrow frequency band speech or audio signal.
Step 302: Weight the predicted envelope information and the envelope information of the previous M frames corresponding to the second high frequency band signal of the previous M frames of speech or audio signals to obtain the first envelope information corresponding to the first high frequency band signal.
Specifically, after the predicted fine structure information and the envelope information corresponding to the first high frequency band signal of the current frame are predicted in step 301, the first envelope information corresponding to the first high frequency band signal can be generated according to the predicted envelope information and the envelope information of the previous M frames corresponding to the second high frequency band signal of the previous M frames of speech or audio signals.
Specifically, the process of generating the first envelope information corresponding to the first high frequency band signal in step 302 can be implemented using the following two modes: 1. As shown in Figure 4, a version of obtaining the first envelope information through step 302 may include the following steps:
Step 401: Calculate a correlation coefficient between the first low frequency band signal and the low frequency band signal of the previous N frames of speech or audio signals according to the first low frequency band signal of the current speech or audio signal frame and the low frequency band signal of the previous N frames of speech or audio signals, where N is greater than or equal to 1.
Specifically, the first low frequency band signal of the current speech or audio signal frame is compared with the low frequency band signal of the previous N frames of speech or audio signals to obtain a correlation coefficient between the first low frequency band signal of the current speech or audio signal frame and the low frequency band signal of the previous N frames of speech or audio signals. For example, the correlation between the first low frequency band signal of the current speech or audio signal frame and the low frequency band signal of the previous N frames of speech or audio signals can be determined by judging the difference between the frequency band of the first low frequency band signal of the current speech or audio signal frame and the same frequency band of the low frequency band signal of the previous N frames of speech or audio signals in terms of energy size or type of information, so that the desired correlation coefficient can be calculated. The previous N frames of speech or audio signals may be narrow frequency band speech or audio signals, wide frequency band speech or audio signals, or a hybrid of narrow frequency band speech or audio signals and wide frequency band speech or audio signals.
Step 402: Judge whether the correlation coefficient is within a given first limit range.
Specifically, after the correlation coefficient is calculated in step 401, it is judged whether the correlation coefficient is within the given limit range. The purpose of calculating the correlation coefficient is to judge whether the current frame of the speech or audio signal is gradually switched from the previous N frames of speech or audio signals. That is, the purpose is to judge whether their characteristics are the same and then determine the weight of the high frequency band signal of the previous frame in the process of predicting the high frequency band signal of the current speech or audio signal. For example, if the first low frequency band signal of the current speech or audio signal frame has the same energy as the low frequency band signal of the previous speech or audio signal frame and their signal types are the same, this indicates that the previous speech or audio signal frame is highly correlated with the current speech or audio signal frame. Therefore, to accurately restore the first envelope information corresponding to the current speech or audio signal frame, the high frequency band envelope information or the transition envelope information corresponding to the previous speech or audio signal frame occupies a greater weight; otherwise, if there is a large difference in energy between the first low frequency band signal of the current speech or audio signal frame and the low frequency band signal of the previous speech or audio signal frame and their signal types are different, this indicates that the previous speech or audio signal frame is weakly correlated with the current speech or audio signal frame. Therefore, to accurately restore the first envelope information corresponding to the current speech or audio signal frame, the high frequency band envelope information or the transition envelope information corresponding to the previous speech or audio signal frame takes up less weight.
Step 403: If the correlation coefficient is not within the first given limit range, weight according to a set first weight 1 and a set first weight 2 to calculate the first envelope information. The first weight 1 refers to the weight value of the previous-frame envelope information corresponding to the high frequency band signal of the previous frame of the speech or audio signal, and the first weight 2 refers to the weight value of the predicted envelope information.
Specifically, if the correlation coefficient is determined to be not within the first given limit range, weight according to a set first weight 1 and a set first weight 2 to calculate the first envelope information. The first weight 1 refers to the weight value of the previous-frame envelope information corresponding to the high frequency band signal of the previous frame of the speech or audio signal, and the first weight 2 refers to the weight value of the predicted envelope information.
Specifically, if the correlation coefficient is determined to be not within the first limit range given in step 402, this indicates that the current frame of the speech or audio signal is slightly correlated with the previous N frames of speech or audio signals. Therefore, the envelope information of the previous M frames, or the transition envelope information corresponding to the wide frequency band speech or audio signal of the previous M frames, or the high frequency band envelope information corresponding to the previous frame of the speech or audio signal, has a slight impact on the first envelope information. When the first envelope information corresponding to the current speech or audio signal frame is restored, the envelope information of the previous M frames, or the transition envelope information corresponding to the wide frequency band speech or audio signal of the previous M frames, or the high frequency band envelope information corresponding to the previous frame of the speech or audio signal takes up less weight. Therefore, the first envelope information of the current frame can be calculated according to the set first weight 1 and the set first weight 2. The first weight 1 refers to the weight value of the envelope information corresponding to the high frequency band signal of the previous frame of the speech or audio signal. The previous speech or audio signal frame may be a wide frequency band speech or audio signal or a processed narrow frequency band speech or audio signal. In the case of the first switching, the previous speech or audio signal frame is the wide frequency band speech or audio signal, while the first weight 2 refers to the weight value of the predicted envelope information. The product of the predicted envelope information and the first weight 2 is added to the product of the envelope information of the previous frame and the first weight 1, and the weighted sum is the first envelope information of the current frame. In addition, subsequently transmitted speech or audio signals are processed according to this method and weighting. The first envelope information corresponding to the speech or audio signal is restored until a speech or audio signal is switched again.
Step 404: If the correlation coefficient is within the first given limit range, weight according to a set second weight 1 and a set second weight 2 to calculate the transition envelope information. The second weight 1 refers to the weight value of the envelope information before switching, and the second weight 2 refers to the weight value of the envelope information of the previous M frames, where M is greater than or equal to 1.
Specifically, if the correlation coefficient is determined to be within the limit range given in step 402, the current speech or audio signal frame has characteristics similar to those of the previous consecutive N frames of speech or audio signals, and the first envelope information corresponding to the current speech or audio signal frame is greatly affected by the envelope information of the previous consecutive N frames of speech or audio signals. In view of the authenticity of the previous M-frame envelopes, the transition envelope information corresponding to the current speech or audio signal frame needs to be calculated according to the envelope information of the previous M frames and the envelope information before switching. When the first envelope information of the current speech or audio signal frame is restored, the envelope information of the previous M frames and the envelope information of the previous L frames before switching should take a greater weight. Then, the first envelope information is calculated according to the transition envelope information. The second weight 1 refers to the weight value of the envelope information before switching, and the second weight 2 refers to the weight value of the envelope information of the previous M frames.
In this case, the product of the envelope information before switching and the second weight 1 is added to the product of the envelope information of the previous M frames and the second weight 2, and the weighted sum is the transition envelope information.
Step 405: Decrease the second weight 1 according to the first weighting step, and increase the second weight 2 according to the first weighting step.
Specifically, while the speech or audio signals are transmitted, the impact of the wide frequency band speech or audio signals before switching on the subsequent narrow frequency band speech or audio signals is gradually diminished. To calculate the first envelope information more accurately, adaptive adjustment needs to be made on the second weight 1 and the second weight 2. As the impact of the L frames of wide frequency band speech or audio signals before switching on the subsequent speech or audio signals is gradually decreased, the value of the second weight 1 is gradually reduced, while the value of the second weight 2 is gradually increased, thus weakening the impact of the envelope information before switching on the first envelope information. In step 405, the second weight 1 and the second weight 2 can be modified according to the following formulas: New second weight 1 = Old second weight 1 - First weighting step; New second weight 2 = Old second weight 2 + First weighting step, where the first weighting step is a fixed value.
Step 406: Judge whether a set third weight 1 is greater than the first weight 1.
Specifically, the third weight 1 refers to the weight value of the transition envelope information. The impact of the transition envelope information on the first envelope information of the current frame can be determined by comparing the third weight 1 with the first weight 1. The transition envelope information is calculated according to the envelope information of the previous M frames and the envelope information before switching. Therefore, the third weight 1 effectively represents the degree of impact that the first envelope information suffers from the envelope information before switching.
Step 407: If the third weight 1 is not greater than the first weight 1, weight according to the set first weight 1 and the set first weight 2 to calculate the first envelope information.
Specifically, when the third weight 1 is determined to be less than or equal to the first weight 1 in step 406, this indicates that the current speech or audio signal frame is already slightly distant from the L frames of speech or audio signals before switching and that the first envelope information is essentially affected by the envelope information of the previous M frames. Therefore, the first envelope information of the current frame can be calculated according to the set first weight 1 and the set first weight 2.
Step 408: If the third weight 1 is greater than the first weight 1, weight according to the set third weight 1 and the set third weight 2 to calculate the first envelope information. The third weight 1 refers to the weight value of the transition envelope information, and the third weight 2 refers to the weight value of the predicted envelope information.
Specifically, if the third weight 1 is determined to be greater than the first weight 1 in step 406, it indicates that the current speech or audio signal frame is close to the L frames of speech or audio signals before switching and that the first envelope information is greatly affected by the envelope information before switching. Therefore, the first envelope information of the current frame needs to be calculated according to the transition envelope information. The third weight 1 refers to the weight value of the transition envelope information, and the third weight 2 refers to the weight value of the predicted envelope information. In this case, the product of the transition envelope information and the third weight 1 is added to the product of the predicted envelope information and the third weight 2, and the weighted sum is the first envelope information.
Step 409: Decrease the third weight 1 according to the second weighting step, and increase the third weight 2 according to the second weighting step until the third weight 1 is equal to 0.
Specifically, the purpose of modifying the third weight 1 and the third weight 2 in step 409 is the same as that of modifying the second weight 1 and the second weight 2 in step 405; that is, the purpose is to make adaptive adjustment on the third weight 1 and the third weight 2 to calculate the first envelope information more accurately as the impact of the L frames of speech or audio signals before switching on the subsequently transmitted speech or audio signals is gradually decreased. As the impact of the L frames of speech or audio signals before switching on the subsequent speech or audio signals is gradually decreased, the value of the third weight 1 is gradually reduced, while the value of the third weight 2 is gradually increased, thus weakening the impact of the envelope information before switching on the first envelope information. In step 409, the third weight 1 and the third weight 2 can be modified according to the following formulas: New third weight 1 = Old third weight 1 - Second weighting step; New third weight 2 = Old third weight 2 + Second weighting step, where the second weighting step is a fixed value.
The sum of the first weight 1 and the first weight 2 is equal to 1; the sum of the second weight 1 and the second weight 2 is equal to 1; the sum of the third weight 1 and the third weight 2 is equal to 1; the initial value of the third weight 1 is greater than the initial value of the first weight 1; and the first weight 1 and the first weight 2 are first constants.
Specifically, weight 1 and weight 2 in this version effectively represent the percentages of the envelope information before switching and the envelope information of the previous M frames in the first envelope information of the current frame. If the current frame of the speech or audio signal is close to the L frames of speech or audio signals before switching and its correlation is high, the percentage of the envelope information before switching is high, while the percentage of the previous M frame envelope information is low. If the current speech or audio signal frame is already slightly distant from the L frames of speech or audio signals before switching, this indicates that the speech or audio signal is transmitted steadily over the network, or if the current speech or audio signal frame is slightly correlated with the L frames of speech or audio signals before switching, this indicates that the characteristics of the current speech or audio signal frame are already modified. Therefore, if the current frame of the speech or audio signal is slightly affected by the L frames of speech or audio signals before switching, the percentage of the envelope information before switching is low.
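A sketch of this first mode (steps 401 to 409), with the adaptive weights and weighting steps carried as per-frame state; the correlation measure, the threshold range, and all numeric values are illustrative assumptions rather than values given by the patent:

```python
import numpy as np

def first_envelope_mode1(cur_low, prev_lows, pred_env, prev_env, env_before_switch,
                         state, thr_low=0.5, thr_high=1.01):
    """Steps 401-409 (illustrative): derive the first envelope of the current frame.

    cur_low           -- first low frequency band signal of the current frame
    prev_lows         -- low band signals of the previous N frames (step 401)
    pred_env          -- predicted envelope information (step 301)
    prev_env          -- envelope information of the previous M frames
    env_before_switch -- envelope information before switching
    state             -- fixed first weights w1_1/w1_2 and adaptive weights
                         w2_1/w2_2, w3_1/w3_2 with their weighting steps
    """
    # Step 401 (placeholder): correlation from the low-band energy similarity;
    # close energies give a value near 1, very different energies a value near 0
    e_cur = np.sum(cur_low ** 2)
    e_prev = np.mean([np.sum(f ** 2) for f in prev_lows])
    corr = min(e_cur, e_prev) / max(e_cur, e_prev, 1e-12)

    if not (thr_low < corr < thr_high):                               # steps 402-403
        return state["w1_1"] * prev_env + state["w1_2"] * pred_env

    trans_env = state["w2_1"] * env_before_switch + state["w2_2"] * prev_env  # step 404
    state["w2_1"] = max(0.0, state["w2_1"] - state["step1"])                  # step 405
    state["w2_2"] = min(1.0, state["w2_2"] + state["step1"])

    if state["w3_1"] <= state["w1_1"]:                                # steps 406-407
        cur_env = state["w1_1"] * prev_env + state["w1_2"] * pred_env
    else:                                                             # step 408
        cur_env = state["w3_1"] * trans_env + state["w3_2"] * pred_env
    state["w3_1"] = max(0.0, state["w3_1"] - state["step2"])          # step 409
    state["w3_2"] = min(1.0, state["w3_2"] + state["step2"])
    return cur_env
```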
In addition, step 404 can be performed after step 405. That is, the second weight 1 and the second weight 2 can be modified first, and then the transition envelope information is calculated according to the second weight 1 and the second weight 2. Similarly, step 408 can be performed after step 409. That is, the third weight 1 and the third weight 2 can be modified first, and then the first envelope information is calculated according to the third weight 1 and the third weight 2. 2. As shown in Figure 5, another version of obtaining the first envelope information through step 302 may also include the following steps:
Step 501: Calculate the correlation coefficient between the first low frequency band signal and the low frequency band signal of the previous frame of speech or audio signal according to the first low frequency band signal of the current frame of speech or audio signal and the low frequency band signal of the previous speech or audio signal frame.
Specifically, to obtain more accurate first envelope information, the relationship between a frequency band of the first low frequency band signal of the current speech or audio signal frame and the same frequency band of the low frequency band signal of the previous frame of the speech or audio signal is calculated. In this version, "corr" can be used to indicate the correlation coefficient. This correlation coefficient is obtained according to the energy ratio between the first low frequency band signal of the current frame of the speech or audio signal and the low frequency band signal of the previous frame of the speech or audio signal. If the energy difference is small, the "corr" is large; otherwise, the "corr" is small. For the specific process, see the calculations on the correlation of the speech or audio signals of the previous N frames in step 401.
Step 502: Judge whether the correlation coefficient is within a second given limit range.
Specifically, after the "corr" value is calculated in step 501, whether the calculated "corr" value is within the second given limit is judged. For example, the second limit range may be represented by cl to c2 in this version.
Step 503: If the correlation coefficient is not within the second given limit range, weight according to the first weight 1 and the first weight 2 to calculate the first envelope information. The first weight 1 refers to the weight value of the previous-frame envelope information corresponding to the high frequency band signal of the previous frame of the speech or audio signal, and the first weight 2 refers to the weight value of the predicted envelope information. The first weight 1 and the first weight 2 are fixed constants.
Specifically, when the "corr" value is determined to be less than c1 or greater than c2, it is determined that the first envelope information corresponding to the current frame of the speech or audio signal is slightly affected by the envelope information of the previous frame of the speech or audio signal before switching. Therefore, the first envelope information of the current frame is calculated according to the set first weight 1 and the set first weight 2. The product of the predicted envelope information and the first weight 2 is added to the product of the envelope information of the previous frame and the first weight 1, and the weighted sum is the first envelope information of the current frame. In addition, subsequently transmitted narrow frequency band speech or audio signals are processed according to this method and weighting. The first envelope information corresponding to the narrow frequency band speech or audio signal is restored until the speech or audio signals with different bandwidths are switched again. For example, the first weight 1 in this version may be represented by a1; the first weight 2 may be represented by b1; the envelope information of the previous frame may be represented by pre_fenv; the predicted envelope information may be represented by fenv; and the first envelope information may be represented by cur_fenv. In this case, step 503 can be represented by the following formula: cur_fenv = pre_fenv x a1 + fenv x b1.
Step 504: If the correlation coefficient is within the second limit range, judge whether a set second weight 1 is greater than the first weight 1. The second weight 1 refers to the weight value of the envelope information before switching corresponding to the high frequency band signal of the speech or audio signal frame before switching.
Specifically, if c1 < corr < c2, the degree of impact of the envelope information before switching and the previous-frame envelope information on the first envelope information of the current frame can be obtained by comparing the second weight 1 with the first weight 1.
Step 505: If the second weight 1 is not greater than the first weight 1, weight according to the first weight 1 and the first weight 2 to calculate the first envelope information.
Specifically, when the second weight 1 is determined to be not greater than the first weight 1 in step 504, it indicates that the current frame of the speech or audio signal is already slightly distant from the speech or audio signal frame before switching and that the first envelope information is slightly affected by the envelope information before switching. Therefore, the first envelope information of the current frame can be calculated according to the set first weight 1 and the set first weight 2. In this case, step 505 can be represented by the following formula: cur_fenv = pre_fenv x a1 + fenv x b1.
Step 506: If the second weight 1 is greater than the first weight 1, weight according to the set second weight 1 and the set second weight 2 to calculate the first envelope information. The second weight 2 refers to the weight value of the predicted envelope information. For example, the second weight 1 may be represented by a2, and the second weight 2 may be represented by b2.
Specifically, when the second weight 1 is determined to be greater than the first weight 1 in step 504, it indicates that the current speech or audio signal frame is closer to the wide frequency band speech or audio signal frame before switching and that the first envelope information is greatly affected by the envelope information before switching, which corresponds to the previous frame of the speech or audio signal before switching. Therefore, the first envelope information of the current frame can be calculated according to the set second weight 1 and the set second weight 2. In this case, the product of the predicted envelope information and the second weight 2 is added to the product of the envelope information before switching and the second weight 1, and the weighted sum is the first envelope information of the current frame. The envelope information before switching can be represented by com_fenv. In this case, step 506 can be represented by the following formula: cur_fenv = com_fenv x a2 + fenv x b2.
Step 507: Decrease the second weight 1 according to the first weight step, and increase the second weight 2 according to the first weight step.
Specifically, as the speech or audio signals are transmitted, the impact of a speech or audio signal before switching on the subsequent speech or audio signal frames is gradually lessened. To calculate the first envelope information more precisely, adaptive adjustment needs to be performed on the second weight 1 and the second weight 2. The impact of the speech or audio signal before switching on the subsequent speech or audio signal frames is gradually diminished, while the impact of the previous speech or audio signal frame near the current speech or audio signal frame is gradually increased. Therefore, the value of the second weight 1 gradually decreases, while the value of the second weight 2 gradually increases. In this way, the impact of the envelope information before switching on the first envelope information is weakened, while the impact of the predicted envelope information on the first envelope information is enhanced. In step 507, the second weight 1 and the second weight 2 can be modified according to the following formulas: New second weight 1 = Old second weight 1 - First weight step; New second weight 2 = Old second weight 2 + First weight step, where the first weight step is a fixed value.
The sum of the first weight 1 and the first weight 2 is equal to 1; the sum of the second weight 1 and the second weight 2 is equal to 1; the initial value of the second weight 1 is greater than the initial value of the first weight 1.
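A sketch of this second mode (steps 501 to 507) using the notation introduced above (corr, c1, c2, a1, b1, a2, b2, pre_fenv, com_fenv, fenv, cur_fenv); the energy-based correlation measure and all numeric defaults are illustrative assumptions:

```python
import numpy as np

def first_envelope_mode2(cur_low, prev_low, pre_fenv, com_fenv, fenv, state,
                         c1=0.5, c2=1.01, a1=0.3, b1=0.7):
    """Steps 501-507 (illustrative): derive cur_fenv, the first envelope of the
    current frame. state carries the adaptive second weights a2, b2 and the step."""
    # Step 501 (placeholder): correlation from the low-band energy similarity;
    # close energies give a value near 1, very different energies a value near 0
    e_cur, e_prev = np.sum(cur_low ** 2), np.sum(prev_low ** 2)
    corr = min(e_cur, e_prev) / max(e_cur, e_prev, 1e-12)

    if not (c1 < corr < c2):                      # steps 502-503
        return pre_fenv * a1 + fenv * b1

    if state["a2"] <= a1:                         # steps 504-505
        cur_fenv = pre_fenv * a1 + fenv * b1
    else:                                         # step 506
        cur_fenv = com_fenv * state["a2"] + fenv * state["b2"]

    # Step 507: weaken the pre-switching envelope's influence for later frames
    state["a2"] = max(0.0, state["a2"] - state["step"])
    state["b2"] = min(1.0, state["b2"] + state["step"])
    return cur_fenv
```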
Step 303: Generate a first processed high frequency band signal according to the first envelope information and the predicted fine structure information.
Specifically, after obtaining the information from the first envelope of the current frame in step 302, the first processed high frequency band signal can be generated according to the information on the first envelope and the predicted fine structure information, so that the second high frequency band signal can be smoothly switched to the first processed high frequency band signal.
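A minimal sketch of step 303, assuming the fine structure is a normalized high-band spectrum and the first envelope information is a vector of per-subband gains (both assumptions, since the text does not fix a representation):

```python
import numpy as np

def generate_processed_high_band(fine_structure, first_envelope, subband_size):
    """Step 303 (illustrative): shape the predicted fine structure with the first
    envelope information to obtain the first processed high frequency band signal."""
    gains = np.repeat(first_envelope, subband_size)   # expand to one gain per coefficient
    return fine_structure[:len(gains)] * gains
```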
When using the method for switching speech or audio signals in this version, in the process of switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal, the first processed high frequency band signal of the current frame is obtained according to the predicted fine structure information and the first envelope information. In this way, the second high frequency band signal of the wide frequency band speech or audio signal before switching can be smoothly switched to the first processed high frequency band signal corresponding to the narrow frequency band speech or audio signal, thus improving the quality of the audio signals received by the user.
Based on the previous technical solution, as shown in Figure 6, step 202 includes the following steps:
Step 601: Judge whether the first processed high frequency band signal needs to be attenuated according to the current speech or audio signal frame and the previous speech or audio signal frame before switching.
Specifically, the first high frequency band signal of the narrow frequency band speech or audio signal is null. In the process of switching the wide frequency band speech or audio signal to the narrow frequency band speech or audio signal, to prevent the negative impact of the first processed high frequency band signal on the restored narrow frequency band speech or audio signal, the energy of the first processed high frequency band signal is attenuated frame by frame until the attenuation coefficient reaches a given limit, after the number of frames of the wide frequency band signal extended from the narrow frequency band speech or audio signal reaches a given number of frames. The interval between the current speech or audio signal frame and the speech or audio signal frame before switching can be obtained according to the current speech or audio signal frame and the speech or audio signal frame before switching. For example, the number of frames of the narrow frequency band speech or audio signal can be recorded using a counter, where the number of frames can be a predetermined value greater than or equal to 0.
Step 602: If the first processed high frequency band signal does not need to be attenuated, synthesize the first processed high frequency band signal and the first low frequency band signal into a wide frequency band signal.
Specifically, if it is determined that the first processed high frequency band signal does not need to be attenuated in step 601, the first processed high frequency band signal and the first low frequency band signal are directly synthesized into a wide frequency band signal.
Step 603: If the first processed high frequency band signal needs to be attenuated, judge whether the attenuation factor corresponding to the first processed high frequency band signal is greater than the limit.
Specifically, the initial value of the attenuation factor is 1, and the threshold is greater than or equal to 0 and less than 1. If it is determined that the first processed high frequency band signal needs to be attenuated in step 601, whether the attenuation factor corresponding to the first processed high frequency band signal is greater than a given threshold is judged in step 603.
Step 604: If the attenuation factor is not greater than the given limit, multiply the first processed high frequency band signal by the limit, and synthesize the product and the first low frequency band signal into the wide frequency band signal.
Specifically, if the attenuation factor is determined to be no greater than the limit given in step 603, it indicates that the energy of the first processed high frequency band signal is already attenuated to some degree and that the first processed high frequency band signal may not cause negative impacts. In this case, this attenuation ratio can be maintained. Then, the first processed high frequency band signal is multiplied by the limit, and the product and the first low frequency band signal are synthesized into a wide frequency band signal.
Step 605: If the attenuation factor is greater than the given limit, multiply the first processed high frequency band signal by the attenuation factor, and synthesize the product and the first low frequency band signal into the wide frequency band signal.
Specifically, if the attenuation factor is greater than the limit given in step 603, it indicates that the first processed high frequency band signal may still cause uncomfortable listening, and the attenuation factor needs to be further decreased until it reaches the given limit. Then, the first processed high frequency band signal is multiplied by the attenuation factor, and the product and the first low frequency band signal are synthesized into the wide frequency band signal.
Step 606: Modify the attenuation factor to decrease the attenuation factor.
Specifically, as speech or audio signals are transmitted, the impact of speech or audio signals prior to switching on subsequent narrowband speech or audio signals is reduced, and the attenuation factor is also gradually reduced.
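A sketch of steps 601 to 606, keeping the attenuation factor and a narrowband frame counter as state between frames; the frame-count condition, the threshold, and the attenuation step are illustrative assumptions, and the concatenation again stands in for a real synthesis filter bank:

```python
import numpy as np

def synthesize_with_attenuation(proc_high, cur_low, state,
                                threshold=0.1, frames_before_atten=20, atten_step=0.05):
    """Steps 601-606 (illustrative). state carries 'atten' (initially 1.0) and
    'nb_frames', the number of frames decoded since switching to narrowband."""
    state["nb_frames"] += 1

    # Step 601: attenuation starts only after enough narrowband frames have passed
    if state["nb_frames"] <= frames_before_atten:                     # step 602
        return np.concatenate([cur_low, proc_high])                   # placeholder synthesis

    if state["atten"] <= threshold:                                   # steps 603-604
        return np.concatenate([cur_low, proc_high * threshold])

    out = np.concatenate([cur_low, proc_high * state["atten"]])       # step 605
    state["atten"] = max(threshold, state["atten"] - atten_step)      # step 606
    return out
```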
Optionally, based on the prior technical solution, when switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal occurs, a version of obtaining the first processed high frequency band signal through step 201 includes the following steps, as shown in Figure 7.
Step 701: Weight according to a set fourth weight 1 and a set fourth weight 2 to calculate a first processed high frequency band signal. The fourth weight 1 refers to the weight value of the second high frequency band signal, and the fourth weight 2 refers to the weight value of the first high frequency band signal of the current speech or audio signal frame.
Specifically, in the process of switching the narrow frequency band speech or audio signal to the wide frequency band speech or audio signal, since the high frequency band signal of the wide frequency band speech or audio signal is not null but the high frequency band signal corresponding to the narrow frequency band speech or audio signal is null, the energy of the high frequency band signal of the wide frequency band speech or audio signal needs to be attenuated to ensure that the narrow frequency band speech or audio signal can be smoothly switched to the wide frequency band speech or audio signal. The product of the second high frequency band signal and the fourth weight 1 is added to the product of the first high frequency band signal and the fourth weight 2, and the weighted sum is the first processed high frequency band signal.
Step 702: Decrease the fourth weight 1 according to the third weight step, and increase the fourth weight 2 according to the third weight step until the fourth weight 1 is equal to 0. The sum of the fourth weight 1 and the fourth weight 2 is equal to 1.
Specifically, as the speech or audio signals are transmitted, the impact of the narrow frequency band speech or audio signals prior to switching on subsequent broadband speech or audio signals gradually becomes less. Therefore, the fourth weight 1 is gradually reduced, while the fourth weight 2 is gradually increased until the fourth weight 1 is equal to 0 and the fourth weight 2 is equal to 1. That is, the transmitted speech or audio signals are always broadband frequency speech or audio signals.
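A sketch of steps 701 and 702: the second high frequency band signal is cross-faded into the real high band, with the fourth weights updated once per frame; the weighting step value is an illustrative assumption:

```python
def fade_in_high_band(cur_high, prev_high, state, step3=0.1):
    """Steps 701-702 (illustrative). prev_high is the second high frequency band
    signal; state holds the fourth weights w4_1 and w4_2, with w4_1 + w4_2 == 1
    and w4_1 starting close to 1 right after switching."""
    processed = state["w4_1"] * prev_high + state["w4_2"] * cur_high   # step 701: weighting
    state["w4_1"] = max(0.0, state["w4_1"] - step3)                    # step 702: fade out
    state["w4_2"] = min(1.0, state["w4_2"] + step3)                    # step 702: fade in
    return processed
```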
Similarly, as shown in Figure 8, another version of obtaining the first processed high frequency band signal through step 201 may also include the following steps:
Step 801: Weight according to a set fifth weight 1 and a set fifth weight 2 to calculate a first processed high frequency band signal. The fifth weight 1 is the weight value of an established fixed parameter, and the fifth weight 2 is the weight value of the first high frequency band signal of the current speech or audio signal frame.
Specifically, since the first high frequency band signal of the narrow frequency band speech or audio signal is null, a fixed parameter may be established to replace the high frequency band signal of the narrow frequency band speech or audio signal, where the fixed parameter is a constant greater than or equal to 0 and less than the energy of the first high frequency band signal. The product of the fixed parameter and the fifth weight 1 is added to the product of the first high frequency band signal and the fifth weight 2, and the weighted sum is the first processed high frequency band signal.
Step 802: Decrease the fifth weight 1 according to the fourth weight step, and increase the fifth weight 2 according to the fourth weight step until the fifth weight 1 is equal to 0. The sum of the fifth weight 1 and the fifth weight 2 is equal to 1.
Specifically, as the speech or audio signals are transmitted, the impact of the narrow frequency band speech or audio signals before switching on the subsequent wide frequency band speech or audio signals gradually becomes less. Therefore, the fifth weight 1 is gradually reduced, while the fifth weight 2 is gradually increased until the fifth weight 1 is equal to 0 and the fifth weight 2 is equal to 1. That is, the transmitted speech or audio signals are always wide frequency band speech or audio signals.
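Steps 801 and 802 follow the same cross-fade, but against the established fixed parameter instead of the second high frequency band signal; again a sketch with an illustrative step value:

```python
def fade_in_from_fixed(cur_high, fixed_param, state, step4=0.1):
    """Steps 801-802 (illustrative). fixed_param is a constant >= 0 whose energy is
    below that of the first high band signal; the fifth weights w5_1 + w5_2 == 1."""
    processed = state["w5_1"] * fixed_param + state["w5_2"] * cur_high  # step 801: weighting
    state["w5_1"] = max(0.0, state["w5_1"] - step4)                     # step 802: fade out
    state["w5_2"] = min(1.0, state["w5_2"] + step4)                     # step 802: fade in
    return processed
```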
When using the method for switching speech or audio signals in this version, in the process of switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal, the high frequency band signal of the wide frequency band speech or audio signal is attenuated to obtain a processed high frequency band signal. In this way, the high frequency band signal corresponding to the narrow frequency band speech or audio signal before switching can be smoothly switched to the processed high frequency band signal corresponding to the wide frequency band speech or audio signal, thus helping to improve the quality of the audio signals received by the user.
In this version, the envelope information can also be replaced by other parameters that can represent the high frequency band signal, for example, a linear prediction coding parameter (LPC) or an amplitude parameter.
Those skilled in the art will understand that all or part of the steps of the method according to the versions of the present invention may be implemented by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium. When the program runs, the steps of the method according to the versions of the present invention are performed. The storage medium may be a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or a compact disc read-only memory (CD-ROM).
Figure 9 shows the structure of the first version of an apparatus for switching speech or audio signals. As shown in Figure 9, the apparatus for switching speech or audio signals includes a processing module 91 and a first synthesizer module 92.
The processing module 91 is adapted to: when switching of a speech or audio signal occurs, weight the first high frequency band signal of the current speech or audio signal frame and the second high frequency band signal of the previous M frame of speech or audio signals to obtain a first processed high frequency band signal, where M is greater than or equal to 1.
The first synthesizer module 92 is adapted to synthesize the first processed high frequency band signal and the first low frequency band signal of the current speech or audio signal frame within a wide frequency band signal.
In the apparatus for switching speech or audio signals in this version, the processing module processes the first high frequency band signal of the current speech or audio signal frame according to the second high frequency band signal of the previous M frame of speech or audio signals, so that the second high frequency band signal can be smoothly switched to the first processed high frequency band signal. In this way, during the process of switching between speech or audio signals with different bandwidths, the high frequency band signal of those speech or audio signals can be smoothly switched. Finally, the first synthesizer module synthesizes the first processed high frequency band signal and the first low frequency band signal within a wide frequency band signal, and the wide frequency band signal is transmitted to the user terminal, so that the user receives a high quality speech or audio signal. By using the apparatus for switching speech or audio signals in this version, speech or audio signals with different bandwidths can be smoothly switched, thereby reducing the impact of the energy jump on the subjective quality of the speech or audio signals and improving the quality of the audio signals received by the user.
Figure 10 shows the structure of the second version of the apparatus for switching speech or audio signals. As shown in Figure 10, the apparatus for switching speech or audio signals in this version is based on the first version and further includes a second synthesis module 103.
The second synthesis module 103 is adapted to synthesize the first high frequency band signal and the first low frequency band signal within the wide frequency band signal when switching of the speech or audio signal does not occur.
In the apparatus for switching speech or audio signals in this version, the second synthesis module is adapted to synthesize the first low frequency band signal and the first high frequency band signal of the current speech or audio signal frame within a wide frequency band signal when switching between speech or audio signals with different bandwidths does not occur. In this way, the quality of the speech or audio signals received by the user is improved.
According to the previous technical solution, optionally, when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, the processing module 101 includes the following modules, as shown in Figure 10 and Figure 11: a prediction module 1011, adapted to predict fine structure information and envelope information corresponding to the first high frequency band signal; a first generation module 1012, adapted to weight the predicted envelope information and the envelope information of the previous M frame corresponding to the second high frequency band signal of the previous M frame of speech or audio signals to obtain first envelope information corresponding to the first high frequency band signal; and a second generation module 1013, adapted to generate the first processed high frequency band signal according to the first envelope information and the predicted fine structure information.
In addition, the apparatus for switching speech or audio signals in this version may include a classification module 1010, adapted to classify the first low frequency band signal of the current speech or audio signal frame. The prediction module 1011 is further adapted to predict the fine structure information and the envelope information according to the signal type of the first low frequency band signal, so that the first processed high frequency band signal can be accurately generated by the first generation module and the second generation module. In this way, the first high frequency band signal can be smoothly switched to the first processed high frequency band signal, thus improving the quality of the speech or audio signals received by the user. Because the classification module classifies the first low frequency band signal of the current speech or audio signal frame, and the prediction module obtains the predicted fine structure information and the predicted envelope information according to that signal type, the predicted fine structure information and the predicted envelope information are more accurate, thus improving the quality of the speech or audio signals received by the user.
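By way of illustration, the cooperation of the first and second generation modules might look like the following sketch, which assumes the high band is rebuilt by applying the weighted envelope to the predicted fine structure sub-band by sub-band; the sub-band split, the helper names and the weight value are assumptions, not the patent's exact procedure:

```python
import numpy as np

def generate_processed_high_band(pred_fine_structure, pred_envelope,
                                 prev_envelope, w_prev, n_subbands=8):
    """Sketch of the first and second generation modules.

    pred_fine_structure: predicted fine structure of the high band.
    pred_envelope, prev_envelope: per-sub-band envelope values, length n_subbands.
    w_prev: weight given to the previous frame's envelope; (1 - w_prev) goes
            to the predicted envelope, as in the first generation module.
    """
    # First generation module: weighted (first) envelope information.
    first_envelope = w_prev * prev_envelope + (1.0 - w_prev) * pred_envelope

    # Second generation module: scale each sub-band of the fine structure by
    # its envelope value to obtain the processed high frequency band signal.
    subbands = np.array_split(pred_fine_structure, n_subbands)
    shaped = [env * band for env, band in zip(first_envelope, subbands)]
    return np.concatenate(shaped)
```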
Based on the previous technical solution, optionally, the first synthesizer module 102 includes the following modules, as shown in Figure 10 and Figure 12: a first judgment module 1021, adapted to judge whether the first processed high frequency band signal needs to be attenuated according to the current speech or audio signal frame and the previous speech or audio signal frame before switching; a third synthesis module 1022, adapted to synthesize the first processed high frequency band signal and the first low frequency band signal within a wide frequency band signal when the first judgment module 1021 determines that the first processed high frequency band signal does not need to be attenuated; a second judgment module 1023, adapted to judge whether the attenuation factor corresponding to the first processed high frequency band signal is greater than a given limit when the first judgment module 1021 determines that the first processed high frequency band signal needs to be attenuated; a fourth synthesis module 1024, adapted for: if the second judgment module 1023 determines that the attenuation factor is not greater than the given limit, multiply the first processed high frequency band signal by the limit, and synthesize the product and the first low frequency band signal within a wide frequency band signal; a fifth synthesis module 1025, adapted for: if the second judgment module 1023 determines that the attenuation factor is greater than the given limit, multiply the first processed high frequency band signal by the attenuation factor, and synthesize the product and the first low frequency band signal within a wide frequency band signal; and a first modification module 1026, adapted to modify the attenuation factor so as to decrease the attenuation factor.
The initial value of the attenuation factor is 1, and the limit is greater than or equal to 0 and less than 1.
When using the apparatus for switching speech or audio signals, the first processed high frequency band signal is attenuated, so that the wide frequency band signal obtained by processing the current speech or audio signal frame is smoother, thus improving the quality of the audio signals received by the user.
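For illustration, the attenuation decision carried out by the judgment, synthesis and modification modules can be sketched as follows; the per-frame decrement of the attenuation factor is an assumed value, since the description only requires that the factor start at 1 and be decreased:

```python
def attenuate_processed_high_band(processed_high, att_factor, limit, decrement=0.05):
    """Apply the attenuation decision before synthesis.

    att_factor starts at 1 and is decreased each frame that requires
    attenuation; limit lies in [0, 1) and bounds the applied gain from below.
    """
    gain = att_factor if att_factor > limit else limit   # second judgment module
    attenuated = gain * processed_high                   # fourth / fifth synthesis modules
    att_factor = max(0.0, att_factor - decrement)        # first modification module
    return attenuated, att_factor
```

The attenuated high band would then be synthesized with the first low frequency band signal into the wide frequency band signal, as described above.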
According to the previous technical solution, optionally, when switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal occurs, the processing module 101 in this version includes the following modules, as shown in Figure 10 and Figure 13a: a first calculating module 1011a, adapted to weight according to a fixed fourth weight 1 and a fourth weight 2 to calculate the first processed high frequency band signal, where the fourth weight 1 refers to the weight value of the second high frequency band signal and the fourth weight 2 refers to the weight value of the first high frequency band signal; and a second modification module 1012a, adapted to decrease the fourth weight 1 according to a third weight stage, and increase the fourth weight 2 according to the third weight stage, until the fourth weight 1 is equal to 0, where the sum of the fourth weight 1 and the fourth weight 2 is equal to 1. Similarly, when switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal occurs, the processing module 101 in this version may also include the following modules, as shown in Figure 10 and Figure 13b: a second calculating module 1011b, adapted to weight according to a fixed fifth weight 1 and a fifth weight 2 to calculate the first processed high frequency band signal, where the fifth weight 1 refers to the weight value of an established fixed parameter and the fifth weight 2 refers to the weight value of the first high frequency band signal; and a third modification module 1012b, adapted to decrease the fifth weight 1 according to a fourth weight stage, and increase the fifth weight 2 according to the fourth weight stage, until the fifth weight 1 is equal to 0, where the sum of the fifth weight 1 and the fifth weight 2 is equal to 1, and where the established fixed parameter is a constant greater than or equal to 0 and less than the energy value of the first high frequency band signal.
When using the apparatus for switching speech or audio signals in this version, in the process of switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal, the high frequency band signal of the wide frequency band speech or audio signal is attenuated to obtain a processed high frequency band signal. In this way, the high frequency band signal corresponding to the narrow frequency band speech or audio signal before switching can be smoothly switched to the processed high frequency band signal corresponding to the wide frequency band speech or audio signal, thus helping to improve the quality of the audio signals received by the user.
It should be noted that the above versions are merely provided to describe the technical solution of the present invention, and are not intended to limit the present invention. Although the present invention has been described in detail with reference to the previous versions, those skilled in the art can make various modifications and variations to the invention without departing from the spirit and scope of the invention. The invention is intended to cover such modifications and variations provided that they fall within the scope of protection defined by the following claims or their equivalents.
Claims (16)
[0001]
1. Method for switching speech or audio signals, characterized by the fact that it comprises: when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, weighting (101) a first high frequency band signal of a current speech or audio signal frame and a second high frequency band signal of the previous M frame of speech or audio signals to obtain a first processed high frequency band signal, where M is equal to 1; and synthesizing (102) the first processed high frequency band signal and a first low frequency band signal of the current speech or audio signal frame within a wide frequency band signal.
[0002]
2. Method according to claim 1, characterized by the fact that it further comprises: when no switching of the speech or audio signal occurs, synthesizing the first high frequency band signal and the first low frequency band signal within the wide frequency band signal.
[0003]
3. Method according to claim 1 or 2, characterized by the fact that, when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, the step of weighting the first high frequency band signal of the current speech or audio signal frame and the second high frequency band signal of the previous M frame of speech or audio signals to obtain the first processed high frequency band signal comprises: predicting the fine structure information and the envelope information corresponding to the first high frequency band signal of the current speech or audio signal frame; weighting the predicted envelope information and the envelope information of the previous M frame corresponding to the second high frequency band signal of the previous M frame of speech or audio signals to obtain first envelope information corresponding to the first high frequency band signal; and generating the first processed high frequency band signal according to the first envelope information and the predicted fine structure information.
[0004]
4. Method according to claim 3, characterized in that the step of predicting the fine structure information and the envelope information corresponding to the first high frequency band signal of the current speech or audio signal frame comprises: classifying the first low frequency band signal of the current speech or audio signal frame; and predicting the fine structure information and the envelope information according to the signal type of the first low frequency band signal.
[0005]
5. Method according to claim 3, characterized in that the step of weighting the predicted envelope information and the envelope information of the previous M frame corresponding to the second high frequency band signal of the previous M frame of speech or audio signals to obtain the first envelope information corresponding to the first high frequency band signal comprises: calculating the correlation coefficient between the first low frequency band signal and the low frequency band signal of the previous N frame of speech or audio signals according to the first low frequency band signal and the low frequency band signal of the previous N frame of speech or audio signals, where N is greater than or equal to 1; judging whether the correlation coefficient is within a given first limit range; if the correlation coefficient is not within the first limit range, weighting according to a fixed first weight 1 and a fixed first weight 2 to calculate the first envelope information, where the first weight 1 refers to a weight value of the previous frame envelope information corresponding to the high frequency band signal of a previous speech or audio signal frame and the first weight 2 refers to a weight value of the predicted envelope information; if the correlation coefficient is within the first limit range, weighting according to a fixed second weight 1 and a fixed second weight 2 to calculate transitional envelope information, where the second weight 1 refers to a weight value of the envelope information corresponding to the high frequency band signal of the previous L frame of speech or audio signals before switching and the second weight 2 refers to a weight value of the envelope information of the previous M frame, where L is greater than or equal to 1; decreasing the second weight 1 according to a first weight stage, and increasing the second weight 2 according to the first weight stage; judging whether a fixed third weight 1 is greater than the first weight 1; if the third weight 1 is not greater than the first weight 1, weighting according to the fixed first weight 1 and the first weight 2 to calculate the first envelope information; if the third weight 1 is greater than the first weight 1, weighting according to the fixed third weight 1 and a third weight 2 to calculate the first envelope information, where the third weight 1 refers to a weight value of the transitional envelope information and the third weight 2 refers to a weight value of the predicted envelope information; and decreasing the third weight 1 according to a second weight stage, and increasing the third weight 2 according to the second weight stage, until the third weight 1 is equal to 0; where: the sum of the first weight 1 and the first weight 2 is equal to 1; the sum of the second weight 1 and the second weight 2 is equal to 1; the sum of the third weight 1 and the third weight 2 is equal to 1; the initial value of the third weight 1 is greater than the initial value of the first weight 1; and the first weight 1 and the first weight 2 are fixed constants.
[0006]
6. Method according to claim 3, characterized in that the step of weighting the predicted envelope information and the envelope information of the previous M frame corresponding to the second high frequency band signal of the previous M frame of speech or audio signals to obtain the first envelope information corresponding to the first high frequency band signal comprises: calculating a correlation coefficient between the first low frequency band signal of the current frame and the low frequency band signal of a previous speech or audio signal frame according to the first low frequency band signal of the current frame and the low frequency band signal of the previous speech or audio signal frame; judging whether the correlation coefficient is within a given second limit range; if the correlation coefficient is not within the second limit range, weighting according to a fixed first weight 1 and a fixed first weight 2 to calculate the first envelope information, where the first weight 1 refers to a weight value of the envelope information corresponding to the high frequency band signal of the previous speech or audio signal frame, the first weight 2 refers to a weight value of the predicted envelope information, and the first weight 1 and the first weight 2 are fixed constants; if the correlation coefficient is within the second limit range, judging whether a fixed second weight 1 is greater than the first weight 1, where the second weight 1 refers to a weight value of the envelope information corresponding to the high frequency band signal of the previous speech or audio signal frame before switching; if the second weight 1 is not greater than the first weight 1, weighting according to the fixed first weight 1 and the first weight 2 to calculate the first envelope information; if the second weight 1 is greater than the first weight 1, weighting according to the second weight 1 and a fixed second weight 2 to calculate the first envelope information, where the second weight 2 refers to a weight value of the predicted envelope information; and decreasing the second weight 1 according to a second weight stage, and increasing the second weight 2 according to the second weight stage; where: the sum of the first weight 1 and the first weight 2 is equal to 1; the sum of the second weight 1 and the second weight 2 is equal to 1; and the initial value of the second weight 1 is greater than the initial value of the first weight 1.
[0007]
7. Method according to claim 3, characterized by the fact that the step of synthesizing the first processed high frequency band signal and the first low frequency band signal of the current speech or audio signal frame within the wide frequency band signal comprises: judging whether the first processed high frequency band signal needs to be attenuated according to the current speech or audio signal frame and a previous speech or audio signal frame before switching; if attenuation is not required, synthesizing the first processed high frequency band signal and the first low frequency band signal within the wide frequency band signal; if attenuation is required, judging whether an attenuation factor corresponding to the first processed high frequency band signal is greater than a given limit; if the attenuation factor is not greater than the given limit, multiplying the first processed high frequency band signal by the limit, and synthesizing the product of the first processed high frequency band signal and the limit and the first low frequency band signal within the wide frequency band signal; if the attenuation factor is greater than the given limit, multiplying the first processed high frequency band signal by the attenuation factor, and synthesizing the product of the first processed high frequency band signal and the attenuation factor and the first low frequency band signal within the wide frequency band signal; and modifying the attenuation factor to decrease the attenuation factor; where: an initial value of the attenuation factor is 1, and the limit is greater than or equal to 0 and less than 1.
[0008]
8. Method according to claim 1 or 2, characterized in that, when switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal occurs, the step of weighting the first high frequency band signal of the current speech or audio signal frame and the second high frequency band signal of the previous M frame of speech or audio signals to obtain the first processed high frequency band signal comprises: weighting according to a fixed fourth weight 1 and a fixed fourth weight 2 to calculate the first processed high frequency band signal, where the fourth weight 1 refers to a weight value of the second high frequency band signal and the fourth weight 2 refers to a weight value of the first high frequency band signal; and decreasing the fourth weight 1 according to a third weight stage, and increasing the fourth weight 2 according to the third weight stage, until the fourth weight 1 is equal to 0, where the sum of the fourth weight 1 and the fourth weight 2 is equal to 1.
[0009]
9. Method according to claim 1 or 2, characterized by the fact that, when switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal occurs, the step of weighting the first high frequency band signal of the current speech or audio signal frame and the second high frequency band signal of the previous M frame of speech or audio signals to obtain the first processed high frequency band signal comprises: weighting according to a fixed fifth weight 1 and a fixed fifth weight 2 to calculate the first processed high frequency band signal, where the fifth weight 1 refers to a weight value of an established fixed parameter, and the fifth weight 2 refers to a weight value of the first high frequency band signal; and decreasing the fifth weight 1 according to a fourth weight stage, and increasing the fifth weight 2 according to the fourth weight stage, until the fifth weight 1 is equal to 0, where the sum of the fifth weight 1 and the fifth weight 2 is equal to 1; where: the fixed parameter is a constant greater than or equal to 0 and less than an energy value of the first high frequency band signal.
[0010]
10. Apparatus for switching speech or audio signals, characterized by the fact that it comprises: a processing module (91; 101), adapted for: when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, weighting a first high frequency band signal of a current speech or audio signal frame and a second high frequency band signal of the previous M frame of speech or audio signals to obtain a first processed high frequency band signal, where M is equal to 1; and a first synthesizer module (92; 102), adapted to synthesize the first processed high frequency band signal and a first low frequency band signal of the current speech or audio signal frame within a wide frequency band signal.
[0011]
11. Apparatus according to claim 10, characterized by the fact that it further comprises: a second synthesis module, adapted to synthesize the first high frequency band signal and the first low frequency band signal within the wide frequency band signal when no switching of the speech or audio signal occurs.
[0012]
12. Apparatus according to claim 10 or 11, characterized by the fact that, when switching from a wide frequency band speech or audio signal to a narrow frequency band speech or audio signal occurs, the processing module comprises: a prediction module, adapted to predict the fine structure information and the envelope information corresponding to the first high frequency band signal of the current speech or audio signal frame; a first generation module, adapted to weight the predicted envelope information and the envelope information of the previous M frame corresponding to the second high frequency band signal of the previous M frame of speech or audio signals to obtain the first envelope information corresponding to the first high frequency band signal; and a second generation module, adapted to generate the first processed high frequency band signal according to the first envelope information and the predicted fine structure information.
[0013]
13. Apparatus according to claim 12, characterized by the fact that it further comprises a classification module, adapted to classify the first low frequency band signal of the current speech or audio signal frame, in which: the prediction module is further adapted to predict the fine structure information and the envelope information according to the signal type of the first low frequency band signal.
[0014]
14. Apparatus according to claim 13, characterized by the fact that, when switching from a narrow frequency band speech or audio signal to the wide frequency band speech or audio signal occurs, the processing module comprises: a second calculating module, adapted to weight according to a fixed fifth weight 1 and a fixed fifth weight 2 to calculate the first processed high frequency band signal, where the fifth weight 1 refers to a weight value of an established fixed parameter and the fifth weight 2 refers to a weight value of the first high frequency band signal; and a third modification module, adapted to decrease the fifth weight 1 according to a fourth weight stage, and increase the fifth weight 2 according to the fourth weight stage, until the fifth weight 1 is equal to 0, where the sum of the fifth weight 1 and the fifth weight 2 is equal to 1, and where the fixed parameter is a constant greater than or equal to 0 and less than the energy value of the first high frequency band signal.
[0015]
15. Apparatus according to claim 12, characterized by the fact that the first synthesizer module comprises: a first judgment module, adapted to judge whether the first processed high frequency band signal needs to be attenuated according to the current speech or audio signal frame and a previous speech or audio signal frame before switching; a third synthesis module, adapted to synthesize the first processed high frequency band signal and the first low frequency band signal within the wide frequency band signal when the first judgment module determines that the first processed high frequency band signal does not need to be attenuated; a second judgment module, adapted to judge whether the attenuation factor corresponding to the first processed high frequency band signal is greater than a given limit when the first judgment module determines that the first processed high frequency band signal needs to be attenuated; a fourth synthesis module, adapted for: if the second judgment module determines that the attenuation factor is not greater than the given limit, multiplying the first processed high frequency band signal by the limit, and synthesizing the product and the first low frequency band signal within the wide frequency band signal; a fifth synthesis module, adapted for: if the second judgment module determines that the attenuation factor is greater than the given limit, multiplying the first processed high frequency band signal by the attenuation factor, and synthesizing the product and the first low frequency band signal within the wide frequency band signal; and a first modification module, adapted to modify the attenuation factor to decrease the attenuation factor; where: the initial value of the attenuation factor is 1, and the limit is greater than or equal to 0 and less than 1.
[0016]
16. Apparatus according to claim 10 or 11, characterized by the fact that, when switching from a narrow frequency band speech or audio signal to a wide frequency band speech or audio signal occurs, the processing module comprises: a first calculating module, adapted to weight according to a fixed fourth weight 1 and a fixed fourth weight 2 to calculate the first processed high frequency band signal, where the fourth weight 1 refers to a weight value of the second high frequency band signal and the fourth weight 2 refers to a weight value of the first high frequency band signal; and a second modification module, adapted to decrease the fourth weight 1 according to a third weight step, and increase the fourth weight 2 according to the third weight step, until the fourth weight 1 is equal to 0, where the sum of the fourth weight 1 and the fourth weight 2 is equal to 1.