Patent Abstract:
Motion prediction video block hierarchy. A video decoder is configured to obtain an index value for a current video block. The video decoder obtains a partition type for the current video block and selects one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type of the current video block. The video decoder selects a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the index value. The video decoder generates a motion vector for the current video block based on the motion information of the predictive video block.
Publication number: BR112013021612A2
Application number: R112013021612-3
Filing date: 2012-02-23
Publication date: 2020-12-01
Inventors: Yunfei Zheng; Wei-Jung Chien; Marta Karczewicz
Applicant: Qualcomm Incorporated
IPC main class:
Patent Description:

"MOTION PREDICTION VIDEO BLOCK HIERARCHY"

This application claims the benefit of: U.S. Provisional Application No. 61/446,392, filed on February 24, 2011; U.S. Provisional Application No. 61/447,017, filed on February 26, 2011; U.S. Provisional Application No. 61/451,493, filed on March 10, 2011; U.S. Provisional Application No. 61/529,110, filed on August 30, 2011; U.S. Provisional Application No. 61/531,526, filed on September 6, 2011; and U.S. Provisional Application No. 61/531,514, filed on September 6, 2011, each of which is incorporated herein by reference in its entirety.
Field of the Invention This disclosure relates to video coding.
Description of the Prior Art Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radiotelephones, so-called "smart phones", video teleconferencing devices, video streaming devices, and the like.
Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC) standard currently under development, and extensions of such standards. Video devices can transmit, receive, encode, decode and/or store digital video information more efficiently by implementing such video compression techniques.
Summary of the Invention In general, this disclosure describes techniques for coding video data. In one example, a method for decoding video data comprises obtaining an index value for a current video block, obtaining a partition type for the current video block, selecting one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type, selecting a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the index value, and generating a motion vector for the current video block based on the motion information of the predictive video block.
In another example, a device includes a video decoder configured to obtain an index value for a current video block, obtain a partition type for the current video block, select one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type, select a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the index value, and generate a motion vector for the current video block based on the motion information of the predictive video block.
In another example, a computer-readable storage medium comprises instructions that, when executed, cause a processor to obtain an index value for a current video block, obtain a partition type for the current video block, select one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type, select a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the index value, and generate a motion vector for the current video block based on the motion information of the predictive video block. This disclosure also describes techniques for encoding video data. In one example, a method comprises obtaining a motion vector for a current video block, obtaining a partition type for the current video block, selecting one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type, selecting a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the motion vector, and generating an index value that identifies the selected predictive video block.
In another example, a device includes a video encoder configured to obtain a motion vector for a current video block, obtain a partition type for the current video block, select one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type, select a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the motion vector, and generate an index value that identifies the selected predictive video block.
In another example, a computer-readable storage medium comprises instructions that, when executed, cause a processor to obtain a motion vector for a current video block, obtain a partition type for the current video block, select one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type, select a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the motion vector, and generate an index value that identifies the selected predictive video block.
Details of one or more examples are presented in the accompanying drawings and in the description that follows. Other features, objects and advantages will be apparent from the description and drawings, and from the claims.
Brief Description of the Drawings Figure 1 is a block diagram showing an example of a video encoding and decoding system that can implement the techniques of this disclosure.
Figure 2 is a block diagram showing an example of a video encoder that can implement the techniques of this disclosure.
Figure 3A is a conceptual diagram showing the current video block and an exemplary set of motion prediction video blocks.
Figure 3B is a conceptual diagram showing the current video block and an exemplary set of motion prediction video blocks.
Figure 3C is a conceptual diagram showing the current video block and an exemplary set of motion prediction video blocks.
Figure 4 is a conceptual diagram showing the temporal relationship between the current video frame and reference video frames.
Figure 5 is a conceptual diagram showing the ordering of a set of motion prediction video blocks based on the temporal relationship with the current video block.
Figure 6 is a conceptual diagram showing examples of candidate video blocks that can be used to generate a set of motion prediction video blocks.
Figure 7 is a conceptual diagram showing an example of a method for searching for candidate video blocks based on criteria for generating a set of motion prediction video blocks.
Figure 8 is a conceptual diagram showing examples of video block partitions.
Figures 9A-9K are conceptual diagrams showing examples of creating an ordered hierarchy, based on the partition type of the current video block, for a set of motion prediction video blocks.
Figure 10 is a flow chart showing a technique for encoding video data.
Figure 11 is a block diagram showing an example of a video decoder that can implement the techniques of this disclosure.
Figure 12 is a flowchart showing a technique for decoding video data.
Detailed Description of the Invention This disclosure describes techniques for generating sets of motion prediction video blocks from candidate video blocks and for creating an ordered hierarchy of the motion prediction video blocks within a set. A video coder can code motion information for the current video block using the ordered hierarchy. For example, a set of candidate video blocks can include video blocks adjacent to the current video block. An ordered set of motion prediction video blocks can be a subset of the adjacent video blocks. The motion information for the current video block can be obtained using the ordered set of motion prediction video blocks, including using one of the following techniques: inheriting a motion vector from a motion prediction video block in the ordered set; calculating a motion vector by adding or subtracting residual motion vector information to or from the motion vector of a motion prediction video block in the ordered set; or calculating a motion vector using the motion vector information of one or more motion prediction video blocks in the ordered set. The use of an ordered hierarchy can allow bit savings to be achieved.
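As a concrete illustration of these three derivation techniques, the following Python sketch shows how a coder might obtain a motion vector from an ordered set of motion prediction video blocks. The function names, mode labels and data layout are illustrative assumptions, not the patent's reference implementation.

```python
# Minimal sketch of deriving a motion vector from an ordered set of
# motion prediction video blocks. Names and structures are illustrative
# assumptions, not taken from the patent or any codec implementation.

def derive_motion_vector(ordered_set, index, mode, mvd=None):
    """ordered_set: list of (mvx, mvy) motion vectors, highest-ranked first.
    index: position of the selected motion prediction video block.
    mode: 'inherit' (merge-style), 'mvd' (predictor + residual), or 'median'.
    mvd: residual motion vector (dx, dy), used only in 'mvd' mode."""
    if mode == 'inherit':
        # Inherit the motion vector of the selected block unchanged.
        return ordered_set[index]
    if mode == 'mvd':
        # Add residual motion vector information to the predictor.
        px, py = ordered_set[index]
        dx, dy = mvd
        return (px + dx, py + dy)
    if mode == 'median':
        # Combine several candidates, e.g. a component-wise median.
        xs = sorted(mv[0] for mv in ordered_set)
        ys = sorted(mv[1] for mv in ordered_set)
        mid = len(ordered_set) // 2
        return (xs[mid], ys[mid])
    raise ValueError(mode)

# Example: candidates ordered T, L, Temp; select index 0 and add an MVD.
candidates = [(4, -2), (3, -1), (6, 0)]
print(derive_motion_vector(candidates, 0, 'mvd', mvd=(1, 1)))  # (5, -1)
```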
In one technique, a set of motion prediction video blocks is generated by analyzing whether the candidate video blocks meet specified criteria. For example, multiple video blocks within a temporal or spatial distance of the current video block can be analyzed to determine whether any of their reference identification values fall within a specified range. In this example, candidate video blocks with a reference identification value equal to the specified value can be included in a set of motion prediction video blocks.
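A minimal sketch of this screening step follows, under the assumption that each candidate carries a reference identification value; the dictionary layout and block names are illustrative.

```python
# Sketch: build a set of motion prediction video blocks by keeping only
# candidates whose reference index matches a specified value. The
# candidate structure is an illustrative assumption.

def build_prediction_set(candidates, target_ref_idx):
    """candidates: list of dicts with 'name' and 'ref_idx' keys."""
    return [c for c in candidates if c['ref_idx'] == target_ref_idx]

neighbors = [
    {'name': 'L',    'ref_idx': 0},
    {'name': 'T',    'ref_idx': 1},
    {'name': 'TR',   'ref_idx': 0},
    {'name': 'BL',   'ref_idx': 2},
    {'name': 'Temp', 'ref_idx': 0},
]
# Only candidates referring to reference index 0 enter the set.
print([c['name'] for c in build_prediction_set(neighbors, 0)])
# ['L', 'TR', 'Temp']
```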
In a technique described by this disclosure, a set of motion prediction video blocks is organized into an ordered hierarchy based on the temporal distance between a reference block associated with each of the motion prediction blocks and a reference block associated with the current video block being coded. In other words, the motion vectors of motion prediction blocks that point to predictive blocks temporally closer to the current video block can be given priority over the motion vectors of motion prediction blocks that point to predictive blocks temporally farther from the current video block.
In a technique described by this disclosure, an ordered hierarchy of motion prediction video blocks is created based on the partition type of the current video block. A subset of three motion prediction blocks can be generated from a set of five adjacent video blocks based on the partition type of the current video block. The partition type may correspond to the shape of a PU partition in accordance with the emerging High Efficiency Video Coding (HEVC) standard.
Figure 1 is a block diagram showing an example of a video encoding and decoding system 10, which can implement the techniques of this disclosure.
As shown in Figure 1, system 10 includes a source device 12, which transmits encoded video to a destination device 16 via a communication channel 15. The source device 12 and the destination device 16 can comprise any of a wide range of devices. In some cases, the source device 12 and the destination device 16 may comprise wireless communication handsets, such as so-called cellular or satellite radiotelephones. The techniques of this disclosure, however, which apply generally to video encoding and decoding, can be applied to non-wireless devices that include video encoding and/or decoding capabilities. The source device 12 and the destination device 16 are merely examples of coding devices that can support the techniques described herein. In the example of Figure 1, the source device 12 includes a video source 20, a video encoder 22, a modulator/demodulator (modem) 23 and a transmitter 24. The destination device 16 can include a receiver 26, a modem 27, a video decoder 28 and a display device 30. Syntax elements can be generated in the video encoder 22 as part of an encoded bit stream, and the syntax elements can be used by the video decoder 28 in decoding the bit stream. The video source 20 may comprise a video capture device, such as a video camera, a video archive that contains previously captured video, a video feed from a video content provider or another video source. As another alternative, video source 20 can generate computer graphics-based data as the source video, or a combination of live video, archived video and computer-generated video. In some cases, if the video source 20 is a video camera, the source device 12 and the destination device 16 can form so-called camera phones or video phones. In each case, the captured, pre-captured or computer-generated video can be encoded by the video encoder 22. In some examples (but not in all cases), once the video data has been encoded by the video encoder 22, the encoded video information can then be modulated by modem 23 according to a communication standard, such as, for example, code division multiple access (CDMA), orthogonal frequency division multiplexing (OFDM) or another communication standard or technique. The encoded and modulated data can then be transmitted to the destination device 16 via transmitter 24. Modem 23 can include various mixers, filters, amplifiers or other components designed for signal modulation. Transmitter 24 may include circuits designed to transmit data, including amplifiers, filters and one or more antennas. Receiver 26 of destination device 16 receives information over channel 15, and modem 27 demodulates the information. The video decoding process performed by the video decoder 28 may include techniques corresponding to the encoding techniques performed by the video encoder 22.
The communication channel 15 can comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines, or any combination of wireless and wired media. Communication channel 15 can be part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. Communication channel 15 generally represents any suitable communication medium, or collection of different communication media, for transmitting video data from the source device 12 to the destination device 16. Once again, Figure 1 is merely exemplary, and the techniques of this disclosure can apply to video coding settings (video encoding or video decoding, for example) that do not necessarily include data communication between encoding and decoding devices. In other examples, data can be retrieved from local memory, streamed over a network, or the like. An encoding device can encode and store data in memory, and/or a decoding device can retrieve and decode data from memory. In many cases, encoding and decoding are performed by unrelated devices that do not communicate with each other, but simply encode data to memory and/or retrieve and decode data from memory. In some cases, video encoder 22 and video decoder 28 can operate substantially in accordance with a video compression standard, such as the emerging HEVC standard. However, the techniques of this disclosure can also be applied in the context of various other video coding standards, including some older standards or new or emerging ones. Although not shown in Figure 1, in some cases the video encoder 22 and video decoder 28 can each be integrated with an audio encoder and decoder, and may include appropriate MUX-DEMUX units, or other hardware and software, to handle the encoding of both audio and video in a common data stream or in separate data streams. If applicable, MUX-DEMUX units can conform to the ITU H.223 multiplexer protocol or to other protocols such as the User Datagram Protocol (UDP).
Video encoder 22 and video decoder 28 can each be implemented as one or more microprocessors, digital signal processors, application-specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware or combinations thereof. Each of video encoder 22 and video decoder 28 can be included in one or more encoders or decoders, either of which can be integrated as part of a combined encoder/decoder (CODEC) in a respective mobile device, subscriber device, broadcast device, server or the like. In this disclosure, the term coder refers to an encoder, a decoder or a CODEC, and the terms coder, encoder, decoder and CODEC all refer to specific machines designed for the coding (encoding and/or decoding) of video data in accordance with this disclosure.
In some cases, devices 12, 16 may operate in a substantially symmetrical manner. For example, each of the devices 12, 16 can include video encoding and decoding components. Consequently, system 10 can support unidirectional or bidirectional video transmission between video devices 12, 16, such as, for example, video streaming, video playback, video broadcasting, or video telephony.
During the encoding process, video encoder 22 can perform various encoding techniques or operations. In general, video encoder 22 operates on blocks of video data in accordance with the HEVC standard. HEVC refers to coding units (CUs), which can be partitioned according to a quadtree partitioning scheme. An "LCU" refers to the largest coding unit supported in a given context. The size of the LCU can itself be signaled as part of the bit stream, for example, as sequence-level syntax. The LCU can be partitioned into smaller CUs. CUs can be partitioned into prediction units (PUs) for prediction purposes. PUs can be square or rectangular in shape. Transforms are not fixed in the emerging HEVC standard, but are defined according to transform unit (TU) sizes, which can be the same size as a given CU, or possibly smaller. Residual samples that correspond to a CU can be subdivided into smaller units using a quadtree structure known as a "residual quadtree" (RQT). RQT leaf nodes can be referred to as transform units (TUs). TUs can be transformed and quantized. Syntax elements can be defined at the LCU level, the CU level, the PU level and the TU level. Elements called "split flags" can be included as CU-level syntax to indicate whether any given CU is itself subdivided into four more CUs. For example, CU0 can refer to an LCU, and CU1 through CU4 can comprise sub-CUs of the LCU.
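To make the split-flag signaling concrete, here is a small recursive sketch of how CU-level split flags carve an LCU into leaf CUs. It is an illustration under simplified assumptions (depth-first flag order, a fixed minimum CU size), not conformant HEVC bitstream parsing.

```python
# Sketch: recursively split a CU into four sub-CUs wherever a split flag
# is set. Flags are consumed in depth-first order, mirroring how
# CU-level syntax would be parsed; this is illustrative only.

def parse_cus(x, y, size, flags, min_size=8):
    flag = flags.pop(0) if size > min_size else 0
    if not flag:
        return [(x, y, size)]  # leaf CU
    half = size // 2
    cus = []
    for dy in (0, half):
        for dx in (0, half):
            cus += parse_cus(x + dx, y + dy, half, flags, min_size)
    return cus

# 64x64 LCU: split once, then split only the first 32x32 sub-CU again.
flags = [1, 1, 0, 0, 0, 0, 0, 0, 0]
print(parse_cus(0, 0, 64, flags))
```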
In accordance with HEVC, video blocks are referred to as coding units (CUs), and many CUs exist within individual video frames (or other independently defined units of video, such as slices). Frames, slices, parts of frames, groups of pictures or other data structures can be defined as units of video information that include a plurality of CUs. CUs can have variable sizes in accordance with the HEVC standard, and the bit stream can define the largest coding units (LCUs) as the largest CU size. With the HEVC standard, LCUs can be divided into smaller and smaller CUs according to a quadtree partitioning scheme, and the different CUs defined in the scheme can be further partitioned into so-called prediction units (PUs). LCUs, CUs and PUs are all video blocks within the meaning of this disclosure. Video encoder 22 can perform predictive encoding, in which the video block being coded (a PU of a CU within an LCU, for example) is compared with one or more predictive candidates in order to identify a predictive block.
This predictive encoding process can be intra (in which case the predictive data is generated based on neighboring intra data within the same frame or video slice) or inter (in which case the predictive data is generated based on video data in previous or subsequent frames or slices). Many different encoding modes can be supported, and video encoder 22 can select a desirable video encoding mode.
According to this disclosure, at least some video blocks can be encoded using the processes described herein.
Video compression techniques include spatial prediction (intra-image) and / or temporal prediction (inter-image) to reduce or remove the redundancy inherent in video sequences.
For block-based video encoding, a video slice (a video frame or part of a video frame, for example) can be partitioned into video blocks, which can also be referred to as tree blocks, coding units (CUs) and/or coding nodes. The video blocks in an intra-coded slice (I) of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. The video blocks in an inter-coded slice (P or B) of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures. Pictures can be referred to as frames, and reference pictures can be referred to as reference frames. When video encoder 22 uses motion estimation and motion compensation to reduce temporal redundancy in a video sequence, a motion vector can be generated to identify a predictive block of video data. The video decoder 28 can use the motion vector to predict the values of the current video block being coded. For example, the values of the predictive video block can be subtracted from the values of the current video block to produce a block of residual data. The motion vector, together with the residual data, can be communicated from the video encoder 22 to the video decoder 28 via communication channel 15. The video decoder 28 can locate the same predictive block (based on the motion vector) and reconstruct the encoded video block by combining the residual data with the data of the predictive block.
Video encoder 22 can use merge mode to encode the motion information of the current video block. Merge mode is a video coding mode in which motion information (such as motion vectors, reference frame indexes, prediction directions or other information) of a neighboring video block is inherited for the current video block being coded. An index value can be used to identify the neighbor from which the current video block inherits its motion information (top, top right, left, bottom left or co-located from a temporally adjacent frame, for example). Another case in which the motion vector of a neighboring video block is used in the coding of the current video block is so-called motion vector prediction. In this case, predictive coding of motion vectors is applied to reduce the amount of data needed to communicate the motion vector. For example, instead of encoding and communicating the motion vector itself, video encoder 22 can encode and communicate a motion vector difference (MVD) relative to a known (or knowable) motion vector. In H.264/AVC, the known motion vector, which can be used with the MVD to define the current motion vector, can be defined by a so-called motion vector predictor (MVP), which is derived as the median of the motion vectors associated with neighboring blocks.
Video encoder 22 can use adaptive motion vector prediction (AMVP) to encode the motion information of the current video block. AMVP builds a candidate set of motion vectors that includes several neighboring blocks in the spatial and temporal directions as candidates for the MVP. In AMVP, video encoder 22 selects the most accurate predictor from the candidate set based on an analysis of encoding rate and distortion (using so-called rate-distortion cost analysis, for example). A motion vector predictor index (mvp_idx) can be transmitted to the video decoder 28 to tell the video decoder 28 where to locate the MVP. An MVD can also be transmitted to the video decoder 28. Video decoder 28 can combine the MVD with the MVP (defined by the motion vector predictor index) to generate the motion vector for the current video block.
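The decoder-side AMVP reconstruction just described can be sketched as follows. The names mvp_idx and mvd follow the text above; the candidate list construction is deliberately simplified (real AMVP also prunes duplicate candidates and scales temporal ones).

```python
# Sketch: AMVP-style motion vector reconstruction at the decoder.
# Illustrative only; not a conformant candidate-list derivation.

def reconstruct_mv(mvp_candidates, mvp_idx, mvd):
    """mvp_candidates: list of (x, y) predictors from spatial/temporal
    neighbors; mvp_idx selects one; mvd is the transmitted difference."""
    px, py = mvp_candidates[mvp_idx]
    dx, dy = mvd
    return (px + dx, py + dy)

# Decoder receives mvp_idx=1 and mvd=(-1, 2) in the bit stream.
candidates = [(8, 4), (7, 3)]
print(reconstruct_mv(candidates, 1, (-1, 2)))  # (6, 5)
```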
After the generation of the predictive block, the differences between the current video block being coded and the predictive block are encoded as a residual block, and prediction syntax (such as a motion vector in the case of inter-coding, or a prediction mode in the case of intra-coding) is used to identify the predictive block. Furthermore, with AMVP or merge mode, the neighboring block used to identify the predictive block can be encoded, for example, through an index value that identifies a specific neighbor according to the ordered hierarchy described herein.
In some cases, the residual block can be transformed and quantized. Transform techniques can comprise a DCT process or a conceptually similar process, integer transforms, wavelet transforms or other types of transforms. In a DCT process, as an example, the transform process converts a set of pixel values (residual pixel values, for example) into transform coefficients, which can represent the energy of the pixel values in the frequency domain. The transform coefficients can be quantized. In particular, quantization can be applied to the transform coefficients and generally involves a process that limits the number of bits associated with any given transform coefficient. More specifically, quantization can be applied according to a quantization parameter (QP) defined at the LCU level. Therefore, the same level of quantization can be applied to all transform coefficients in the TUs associated with different PUs of CUs within an LCU. However, instead of signaling the QP itself, a change or difference (i.e., a delta) in the QP can be signaled with the LCU to indicate the change in QP relative to that of a previous LCU.
Following the transform and quantization, entropy coding can be performed on the quantized and transformed residual video blocks. Syntax elements can also be included in the entropy-coded bit stream. In general, entropy coding comprises one or more processes that collectively compress a sequence of quantized transform coefficients and/or other syntax information. Scanning techniques can be performed on the quantized transform coefficients to define one or more serialized one-dimensional vectors of coefficients from two-dimensional video blocks. The scanned coefficients are then entropy coded together with any syntax information, via, for example, context-adaptive variable-length coding (CAVLC), context-adaptive binary arithmetic coding (CABAC) or another entropy coding process. As part of the encoding process, the encoded video blocks can be decoded in order to generate the video data used for subsequent prediction-based coding of subsequent video blocks. This is often referred to as the decoding loop of the encoding process, and generally mimics the decoding that is carried out by a decoder device.
In the decoding loop of an encoder or decoder, filtering techniques can be used to improve video quality and, for example, smooth the boundaries between pixels and possibly remove artifacts from the decoded video. This filtering can be in-loop or post-loop. With in-loop filtering, the filtering of reconstructed video data occurs in the coding loop, which means that the filtered data is stored by an encoder or a decoder for subsequent use in the prediction of subsequent image data. In contrast, with post-loop filtering, the filtering of reconstructed video data occurs outside the coding loop, which means that unfiltered versions of the data are stored by an encoder or decoder for subsequent use in the prediction of subsequent image data. Loop filtering often follows a separate deblock filtering process, which typically applies filtering to pixels that are at or near the boundaries of adjacent video blocks in order to remove blocking artifacts that manifest at the boundaries between video blocks.
Figure 2 is a block diagram showing a video encoder 50 compatible with this disclosure.
The video encoder 50 can correspond to the video encoder 22 of the device 12 or the video encoder of a different device.
As shown in Figure 2, video encoder 50 includes a quadtree partition unit 31, a prediction encoding unit 32, a memory 34, a transform module 38, a quantization unit 40, an inverse quantization unit 42, an inverse transform module 44, an entropy coding unit 46, a filter unit 47, which can include deblocking filters and post-loop and/or in-loop filters, an adder 48 and an adder 51. The encoded video data and the syntax information that defines the manner of encoding can be communicated to the entropy coding unit 46, which performs entropy coding on the bit stream.
As shown in Figure 2, the prediction encoding unit 32 can support a plurality of different encoding modes 35 for encoding video blocks. Modes 35 can include inter-coding modes that define predictive data from different video frames (or slices). The inter-coding modes can be bi-predictive, meaning that two different lists (List 0 and List 1, for example) of predictive data (and typically two different motion vectors) are used to identify the predictive data. The inter-coding modes can alternatively be uni-predictive, meaning that one list (List 0, for example) of predictive data (and typically one motion vector) is used to identify the predictive data. Interpolation, offsets or other techniques can be performed in conjunction with the generation of predictive data. So-called SKIP modes and DIRECT modes, which inherit the motion information associated with a co-located block of another frame (or slice), can also be supported. SKIP mode blocks do not include residual information, while DIRECT mode blocks include residual information.
In addition, modes 35 can include intra-coding modes, which define predictive data based on data within the same video frame (or slice) being coded. The intra-coding modes can include directional modes that define predictive data based on data in a specific direction within the same frame, as well as DC and/or planar modes that define predictive data based on the average or weighted average of neighboring data. The prediction encoding unit 32 can select the mode for a given block based on some criterion, such as a rate-distortion analysis or some characteristics of the block, such as its size, texture or other characteristics.
In accordance with this disclosure, the prediction encoding unit 32 supports one or more modes that perform the adaptive motion vector prediction (AMVP) described above or the merge mode described above. In these or other cases, motion information can be inherited from a block in the manner described herein, and the signaling of the block from which such inheritance occurs can be performed in the manner described herein.
Generally, during the encoding process, the video encoder 50 receives input video data. The prediction encoding unit 32 performs predictive coding techniques on video blocks (CUs and PUs, for example). The quadtree partition unit 31 can break an LCU into smaller CUs and PUs according to HEVC partitioning. For inter-coding, the prediction encoding unit 32 compares the CUs or PUs with various predictive candidates in one or more reference frames or slices (one or more "lists" of reference data, for example) in order to define a predictive block. For intra-coding, the prediction encoding unit 32 generates a predictive block based on neighboring data within the same frame or video slice. The prediction encoding unit 32 outputs the prediction block, and adder 48 subtracts the prediction block from the CU or PU being coded in order to generate a residual block. Again, at least some video blocks can be coded using the AMVP described herein.
In some cases, the prediction encoding unit 32 may include a rate-distortion (R-D) unit, which compares the results of encoding video blocks (CUs or PUs, for example) in different modes. In this case, the prediction encoding unit 32 may also include a mode selection unit for analyzing the encoding results in terms of encoding rate (that is, the encoding bits needed for the block) and distortion (representing the video quality of the encoded block with respect to the original block, for example) in order to make mode selections for video blocks. In this way, the R-D unit can provide an analysis of the results of the different modes to allow the mode selection unit to select the desired mode for different video blocks. In accordance with this disclosure, a mode that performs AMVP can be selected when the R-D unit identifies it as the desired mode for a given video block, for example, due to encoding gains or encoding efficiency. Alternatively, in accordance with this disclosure, a merge mode can be selected in which motion information is inherited from a neighboring block. In these or other examples, an ordered set of neighbors can be defined and used in coding consistent with this disclosure.
Referring again to Figure 2, after the prediction encoding unit 32 outputs the prediction block, and after the adder 48 subtracts the prediction block from the video block being coded in order to generate a residual block of residual pixel values, the transform module 38 applies a transform to the residual block. The transform can comprise a discrete cosine transform (DCT) or a conceptually similar transform, such as that defined by the ITU H.264 standard or the HEVC standard. So-called "butterfly" structures can be defined to perform the transforms, or matrix-based multiplication can also be used. In some examples, in accordance with the HEVC standard, the size of the transform can vary for different CUs, depending on the level of partitioning that occurs with respect to a given LCU. Transform units (TUs) can be defined in order to set the transform size applied by the transform module 38. Wavelet transforms, integer transforms, subband transforms or other types of transforms can also be used. In any case, the transform module applies the transform to the residual block, producing a block of residual transform coefficients. The transform, in general, can convert the residual information from the pixel domain to the frequency domain.
The quantization unit 40 then quantizes the residual transform coefficients to further reduce the bit rate. The quantization unit 40, for example, can limit the number of bits used to encode each of the coefficients. In particular, the quantization unit 40 can apply the delta QP defined for the LCU in order to define the level of quantization to be applied (such as by combining the delta QP with the QP of the previous LCU or some other known QP). After quantization is performed on the residual samples, the entropy coding unit 46 can scan and entropy encode the data.
CAVLC is a type of entropy coding technique supported by the ITU H.264 standard and the emerging HEVC standard, which can be applied on a vectorized basis by the entropy coding unit 46. CAVLC uses variable length coding (VLC) tables in a way that effectively compresses serialized "runs" of coefficients and/or syntax elements. CABAC is another type of entropy coding technique supported by the ITU H.264 standard or the HEVC standard, which can be applied on a vectorized basis by the entropy coding unit 46. CABAC can involve several stages, including binarization, context model selection and binary arithmetic coding. In this case, the entropy coding unit 46 encodes coefficients and syntax elements according to CABAC. Many other types of entropy coding techniques also exist, and new entropy coding techniques are likely to emerge in the future. This disclosure is not limited to any specific entropy coding technique.
Following the entropy coding by entropy coding unit 46, the encoded video can be transmitted to another device or archived for later transmission or retrieval. The encoded video can comprise the entropy-coded vectors and various syntax information. Such information can be used by the decoder to properly configure the decoding process. The inverse quantization unit 42 and the inverse transform module 44 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain. Adder 51 adds the reconstructed residual block to the prediction block produced by the prediction encoding unit 32 to produce a reconstructed video block for storage in memory 34. Before such storage, however, filter unit 47 can apply filtering to the video block to improve video quality. The filtering applied by the filter unit 47 can reduce artifacts and smooth the boundaries between pixels. In addition, the filtering can improve compression by generating predictive video blocks that closely match the video blocks being coded.
According to the techniques described herein, the prediction encoding unit 32 can use ordered hierarchies to identify a predictive block for encoding the current block, and/or it can generate an index value that identifies a specific predictive block according to an ordered hierarchy. Figure 3A is a conceptual diagram showing the current video block and candidate motion prediction video blocks (i.e., top (T), top right (TR), left (L), bottom left (BL), or co-located from a temporally adjacent frame (Temp)) from which the current video block can derive motion information. Figure 3B is a conceptual diagram showing the current video block and one of a plurality of sets of motion prediction video blocks that can be derived from the set of candidate motion prediction video blocks in Figure 3A. Figure 3C is a conceptual diagram showing the current video block and candidate motion prediction video blocks (i.e., top (T), top left (TL), top right (TR), left (L) or bottom left (BL)) from which the current video block can derive motion information.
Figures 4 and 5 are conceptual diagrams showing the use of an ordered hierarchy of motion prediction video blocks to identify a predictive video block for encoding the current video block. In this example, Figures 4 and 5 show the temporal distance between the current video block and each of the motion prediction video blocks, which is used to create an ordered hierarchy. The ordered hierarchy can be created by video encoder 50 based on the input video data, or created beforehand and stored in memory 34. Creating an ordered hierarchy based on temporal distance can exploit the fact that motion prediction video blocks that have shorter temporal distances from the current video block are more likely to be better predictors than video blocks that have longer temporal distances. In the example shown in Figures 4 and 5, the set of motion prediction blocks can include the five blocks shown in Figure 3A. In other examples, the set of motion prediction video blocks may include more or fewer motion prediction video blocks. The size of the set and the motion prediction video blocks included in the set can vary for each current video block. For example, a set of three motion prediction video blocks can be generated from the five video blocks shown in Figure 5.
A picture order count (POC) value associated with the motion information of a motion prediction video block can be used to define the temporal distance between each of the motion prediction video blocks and the current video block. In the example shown in Figures 4 and 5, the current video block being coded is located in frame 5 (POC = 5). The motion information of the motion prediction video blocks points to frame 0 for block L, frame 1 for block BL, frame 2 for block T, frame 3 for block Temp, and frame 4 for block TR. Therefore, the hierarchy of motion prediction blocks can be defined as follows: block TR, followed by block Temp, followed by block T, followed by block BL, followed by block L.
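The following sketch orders a candidate set by the POC distance just described; the block names and POC values follow the example of Figures 4 and 5, while the data layout is an illustrative assumption.

```python
# Sketch: order motion prediction video blocks by the temporal (POC)
# distance between the current frame and the frame each candidate's
# motion vector points to. Smaller distance ranks higher.

current_poc = 5
candidates = {'L': 0, 'BL': 1, 'T': 2, 'Temp': 3, 'TR': 4}  # reference POCs

ordered = sorted(candidates, key=lambda b: abs(current_poc - candidates[b]))
print(ordered)  # ['TR', 'Temp', 'T', 'BL', 'L']
```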
As described above, the prediction encoding unit 32 can use the exemplary ordered hierarchy shown in Figure 5 to encode the motion information of the current video block. In one example, the ordered hierarchy can be programmed in advance and stored in memory 34. In another example, video encoder 50 can generate hierarchies adaptively by analyzing the video data. Once a hierarchy has been determined, each of the motion prediction video blocks can be assigned variable codewords as index values. The motion prediction video block that has the highest probability of being the highest-ranked motion prediction video block for a given current video block can be assigned the shortest codeword. In the example shown in Figure 5, the TR video block can have the shortest codeword. By assigning index values differently depending on the ordered hierarchy (the temporal distance of the motion information, for example), bit savings can be achieved. In some cases, variable length codes can be used to assign shorter codes to motion prediction video blocks with better correlation (in terms of the temporal distance of the motion information, for example). In other cases, fixed codes can be used, but some motion prediction video blocks can be excluded, thereby obtaining shorter fixed codes due to the use of fewer motion prediction video blocks.
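One simple way to realize such variable-length index values is a truncated unary code, sketched below. This particular code is an assumption for illustration; the disclosure does not mandate a specific variable-length code.

```python
# Sketch: assign truncated unary codewords to an ordered hierarchy so
# that higher-ranked motion prediction video blocks get shorter codes.

def truncated_unary_codes(ordered_blocks):
    n = len(ordered_blocks)
    codes = {}
    for rank, block in enumerate(ordered_blocks):
        if rank < n - 1:
            codes[block] = '1' * rank + '0'   # e.g. 0, 10, 110, ...
        else:
            codes[block] = '1' * rank         # last code drops the stop bit
    return codes

hierarchy = ['TR', 'Temp', 'T', 'BL', 'L']  # the ordering from Figure 5
print(truncated_unary_codes(hierarchy))
# {'TR': '0', 'Temp': '10', 'T': '110', 'BL': '1110', 'L': '1111'}
```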
The prediction encoding unit 32 can compare the motion information of the current video block with the motion information of the motion prediction blocks in a set and select an index value for the current video block, where the index value identifies one of the motion prediction video blocks. Depending on the coding mode, the motion information of the current video block can be generated using the index value either to inherit a motion vector from the identified motion prediction video block, or to calculate a motion vector by adding or subtracting residual motion vector information to or from the motion vector of the identified motion prediction video block. The exemplary method shown in Figures 4 and 5 can be based on a scenario in which the current video block and the motion prediction video blocks use a uni-predictive mode. However, the method of Figures 4 and 5 can also be extended to bi-predictive scenarios, where each video block has two motion vectors, by considering the combined distance of the two predictive blocks of the motion prediction video blocks coded in bi-predictive mode with respect to the current video block. In some examples, if any of the motion prediction video blocks have the same POC, then a predefined order can be used, or other criteria can be used to order the motion prediction video blocks. In one example, the default order can be block T, followed by block L, followed by block Temp, followed by block TR, followed by block BL. For a set of five blocks, any of the 120 possible orders can be used as the predefined order. Other criteria that can be used to determine the order may include: reference list, reference index, prediction direction, block size, prediction unit size, prediction partition type, transform index, transform size or other information related to the video blocks. For example, the ordering can be based on whether the size or shape of the video block being coded matches the size or shape of the motion prediction video blocks. If one or more motion prediction video blocks cannot be ordered solely on a specified temporal characteristic (each motion prediction video block refers to the same predictive block, for example), a secondary criterion can be based on the other ordering techniques described herein. In another technique described by this disclosure, a set of motion prediction video blocks can be organized into an ordered hierarchy based on the partition shape of the current video block.
In accordance with the disclosure, the prediction encoding unit 32 can use other techniques to encode the motion information of the current video block. Figure 6 is a conceptual diagram showing an example of possible video blocks that can be analyzed in order to determine how to encode the motion information for the current video block. In Figure 6, the video blocks are located in the same frame as the current video block. In another example, the candidate video blocks can also be located in frames (already encoded/decoded) different from that of the current video block. For example, the block co-located with the current video block in one or more previously encoded frames can also be a candidate video block. The prediction encoding unit 32 can analyze the motion information of the candidate video blocks shown in Figure 6.
Figure 7 is a conceptual diagram showing an example of a method for analyzing video blocks based on criteria for encoding motion information for the current video block. Essentially, the example shown in Figure 7 shows an efficient way to compare the motion information of the current video block with that of video blocks that can be used to encode the motion information of the current video block. The example described with respect to Figure 7 can be used to search for sets of motion prediction video blocks of different sizes.
In the example shown in Figure 7, there are eleven motion prediction video blocks. Each of the video blocks includes a prediction direction value (i.e., uni-prediction or bi-prediction), a reference list value and a reference index value. Before the motion information of the current video block is compared with the motion information of each of the eleven motion prediction video blocks shown in Figure 7, a first comparison of the prediction direction value, the reference list value and the reference index value can take place. This can cause fewer comparisons between motion information to occur. In this way, the prediction encoding unit 32 can efficiently search for a motion prediction video block for the current video block. According to the example shown in Figure 7, the reference list value and the reference index value of a motion prediction video block can be compared with those of the current video block. In the example shown in Figure 7, it can be determined whether the motion vector of a motion prediction video block predicts from the same reference list and the same reference index as the motion vector of the current video block. As shown in the example of Figure 7, assuming that the current video block is being coded in bi-prediction mode, the two motion vectors used in this bi-prediction mode point to reference list L1 and reference index 0. A search can be made through the motion prediction video blocks to find video blocks coded in bi-prediction mode whose two motion vectors point to reference list L1 and reference index 0. In the exemplary search method shown in Figure 7, the search starts from the left, along the left search direction (from video blocks 0 to 4); if a match is found (in this example, candidate video block 2 is a match), the left search can be stopped and a top search is started, from candidate video blocks 5 to 10, along the top search direction. Once the first matching top candidate video block is found (in this example, video block 6 is a match), the top search can be stopped. The motion information of the current video block can then be compared with that of video block 2 and with that of video block 6. This process can be repeated until the motion information of a predictive video block is within a threshold of the motion information of the current video block.
It should be noted that, in the example shown in Figure 7, if the prediction direction were not taken into account, the set of motion prediction blocks might include video block 0 (the first match in the left search) and video block 6 (the first match in the top search). Candidate video block 0 might ultimately not be useful for predicting the motion vector information of the current video block, since it is coded in uni-prediction mode.
In line with the examples in this disclosure, additional criteria can be added to analyze candidate video blocks. In addition to the reference list, reference index and prediction direction, additional criteria may include one or more of block size, prediction unit size, prediction partition type, transform index, transform size or other information related to the video block.
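A sketch of the two-direction search with early termination described for Figure 7 follows; the candidate fields are assumptions based on the criteria listed above, and the block data reproduces the Figure 7 example (blocks 2 and 6 match).

```python
# Sketch: search the left candidates (blocks 0-4), stopping at the first
# block whose prediction direction, reference list and reference index
# match the current block; then do the same for the top candidates (5-10).

def first_match(blocks, target):
    for b in blocks:
        if (b['dir'] == target['dir'] and
                b['ref_list'] == target['ref_list'] and
                b['ref_idx'] == target['ref_idx']):
            return b  # search in this direction stops here
    return None

def search_candidates(left_blocks, top_blocks, current):
    matches = []
    for group in (left_blocks, top_blocks):
        m = first_match(group, current)
        if m is not None:
            matches.append(m)
    return matches

current = {'dir': 'bi', 'ref_list': 'L1', 'ref_idx': 0}
left = [{'id': i, 'dir': 'bi' if i == 2 else 'uni',
         'ref_list': 'L1', 'ref_idx': 0} for i in range(5)]
top = [{'id': i, 'dir': 'bi' if i == 6 else 'uni',
        'ref_list': 'L1', 'ref_idx': 0} for i in range(5, 11)]
print([m['id'] for m in search_candidates(left, top, current)])  # [2, 6]
```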
The prediction encoding unit 32 can generate an index value to tell a decoder where to locate a motion prediction block (top, left or co-located, for example). A decoder can perform a corresponding search process to determine a motion prediction video block. In this way, a decoder can generate a motion vector for the current video block by searching a subset of video blocks. With reference to Figure 7, an index value may indicate a subset of motion prediction video blocks (i.e., video blocks 0 through 4, or video blocks 5 through 10) of a known set. Using the index value, a decoder can compare information such as the prediction direction value, the reference list value and the reference index value of the motion prediction video blocks with those of the current video block. If there is a "match", the motion information for the current video block can be generated from the first predictive video block that produces a "match". In this way, the motion information for the current video block can be encoded using an index value that identifies the subset. This can achieve significant bit savings when compared to generating an index value that identifies a specific predictive video block.
According to the disclosure, the prediction encoding unit 32 can encode the motion information of the current video block using an ordered hierarchy, where the ordered hierarchy is based on partition information.
Figure 8 is a conceptual diagram showing examples of video block partitions.
The prediction partition shapes shown in Figure 8 are some examples of prediction partition shapes, which can be defined by a prediction unit (PU) shape and a PU index in accordance with the emerging High Efficiency Video Coding (HEVC) standard. When the motion information of the current video block is encoded using a partition shape (which can be defined by a PU shape and an index that defines a PU size), the five candidate blocks can be ordered from higher probability to lower probability. In this case, the probability corresponds to the probability that the motion vector of one of the five candidate video blocks "matches" the motion vector of the current video block.
The ordered sets can be programmed beforehand and stored in both an encoder and a decoder.
Figures 9A-9K are conceptual diagrams showing examples of creating an ordered hierarchy based on a partition type for a set of motion prediction video blocks. In the examples shown in Figures 9A-9K, the set of motion prediction blocks includes the five blocks shown in Figure 3A. In other examples, the set of motion prediction video blocks may include more or fewer motion prediction video blocks. For example, a set can include three motion prediction video blocks. Including fewer motion prediction video blocks in a set can reduce coding complexity.
In the examples shown in Figures 9A-9K, the numbers within the motion prediction video blocks represent the hierarchical ordering of the motion prediction video blocks. For each prediction partition shape (which can be defined by a PU shape and an index that defines a PU size), the ordering can be preprogrammed and stored in both the encoder and the decoder.
In Figure 9A, for example, the hierarchy of motion prediction video blocks when the partition shape is 2Nx2N can be defined as: block L, block T, block BL, block TR, Temp.
In Figure 9B, for example, the hierarchy of motion prediction video blocks when the partition shape is 2NxN 0 can be defined as: block T, block L, block TR, block BL, Temp.
In Figure 9C, for example, the hierarchy of motion prediction video blocks when the partition shape is 2NxN 1 can be defined as: block L, block BL, Temp, block TR, block T.
In Figure 9D, for example, the hierarchy of motion prediction video blocks when the partition shape is Nx2N 0 can be defined as: block L, block T, block BL, Temp, block TR.
In Figure 9E, for example, the hierarchy of motion prediction video blocks when the partition shape is Nx2N 1 can be defined as: block T, block TR, Temp, block BL, block L.
In Figure 9F, for example, the hierarchy of motion prediction video blocks when the partition shape is NxN 0 can be defined as: block L, block T, block BL, block TR, Temp.
In Figure 9G, for example, the hierarchy of motion prediction video blocks when the partition shape is NxN 2 can be defined as: block L, block BL, block T, block TR, Temp.
In Figure 9H, for example, the hierarchy of motion prediction video blocks when the partition shape is NxN 1 can be defined as: block T, block TR, block L, Temp, block BL.
In Figure 9I, for example, the hierarchy of motion prediction video blocks when the partition shape is NxN 3 can be defined as: block L, block T, Temp, block TR, block BL.
In Figure 9J, for example, the hierarchy of motion prediction video blocks when the partition shape is Nx2N 0 can be defined as: block TL, block T, block BL, Temp, block TR, block L.
In Figure 9K, for example, the hierarchy of motion prediction video blocks when the partition shape is Nx2N 1 can be defined as: block T, block TR, Temp, block BL, block TL, block L.
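These preprogrammed orderings amount to a lookup table keyed by partition shape and PU index, as in the sketch below. The entries are transcribed from the Figure 9A-9I examples above; the storage format is an assumption, and the alternative six-block sets of Figures 9J-9K (which draw on the Figure 3C candidates, including TL) would be kept in a separate table.

```python
# Sketch: preprogrammed candidate orderings keyed by (partition shape,
# PU index), as both encoder and decoder would store them. Entries
# follow the examples of Figures 9A-9I.

ORDERED_SETS = {
    ('2Nx2N', 0): ['L', 'T', 'BL', 'TR', 'Temp'],
    ('2NxN', 0):  ['T', 'L', 'TR', 'BL', 'Temp'],
    ('2NxN', 1):  ['L', 'BL', 'Temp', 'TR', 'T'],
    ('Nx2N', 0):  ['L', 'T', 'BL', 'Temp', 'TR'],
    ('Nx2N', 1):  ['T', 'TR', 'Temp', 'BL', 'L'],
    ('NxN', 0):   ['L', 'T', 'BL', 'TR', 'Temp'],
    ('NxN', 1):   ['T', 'TR', 'L', 'Temp', 'BL'],
    ('NxN', 2):   ['L', 'BL', 'T', 'TR', 'Temp'],
    ('NxN', 3):   ['L', 'T', 'Temp', 'TR', 'BL'],
}

def ordered_candidates(shape, pu_index):
    """Return the preprogrammed ordered candidate set for a partition."""
    return ORDERED_SETS[(shape, pu_index)]

print(ordered_candidates('Nx2N', 1))  # ['T', 'TR', 'Temp', 'BL', 'L']
```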
The prediction encoding unit 32 can encode the motion information of the current video block using the exemplary hierarchical orderings shown in Figures 9A-9K. The exemplary hierarchical orderings can be stored in memory 34. Figure 10 is an example of a flowchart showing a technique for encoding video data using the exemplary hierarchical orderings of Figures 9A-9K. It should be noted that, although Figure 10 is described in conjunction with video encoder 50, the steps described in Figure 10 can be performed by other devices and components. In step 250, the prediction encoding unit 32 obtains a motion vector for the current video block. As described above, a motion vector indicates a predictive video block that can be used to encode the current video block. In step 252, the prediction encoding unit 32 obtains a partition type for the current video block. The prediction encoding unit 32 can receive a partition type value from the quadtree partition unit 31. In one example, the partition type corresponds to one of the partition types described in Figures 9A-9K.
In step 254, the prediction encoding unit 32 selects one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type. For example, if the partition type is Nx2N 1, the set of candidate predictive video blocks can be defined as: block T, block TR, Temp, block BL, block L. In step 256, the prediction encoding unit 32 selects a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the motion vector of the current video block. In step 256, the motion vector of the current video block can be compared with each of the motion vectors of the candidate video blocks within the set. The comparison can be made in a manner similar to the search described with respect to Figure 7, where the minimum number of comparisons between motion vectors is made and, if a motion vector is found within a threshold of the motion vector of the current video block, the search is complete. In step 258, the prediction encoding unit 32 generates an index value that identifies the selected predictive video block. The index values for each of the candidate predictive video blocks can be stored in memory 34.
Variable-length codewords can be used as the index values for each of the motion prediction video blocks. The motion prediction video block that has the highest probability of being the highest-ranked motion prediction video block for a given current video block can be assigned the shortest codeword. By assigning variable-length index values, bit savings can be achieved. A decoder can be programmed to know the same hierarchy and can therefore properly interpret the received codeword to make the same selection that was made at the encoder. In one example, the highest-ranked predictive video block in each of the sets defined in Figures 9A-9K can be assigned an index value of one bit. In other examples, only a subset (the top 3 of 5, for example) can be considered in any given scenario, which can reduce coding complexity. Thus, if several video blocks in a group are coded using only a subset of the video blocks, the number of index values used to encode the group can also be reduced. In this case, video encoder 50 can signal a reduced set of index values for a group of encoded video blocks.
The examples described here of creating an ordered hierarchy of motion prediction video blocks based on the partition shape can be used in conjunction with methods for generating motion information for the current video block. For example, an encoder can generate the motion information for the current video block using any of the following techniques: inheriting a motion vector from the identified motion prediction video block, calculating a motion vector by adding residual motion vector information to, or subtracting it from, a motion vector of an identified motion prediction video block, or calculating a motion vector using the motion vector information from one or more highly ranked motion prediction video blocks, by selecting the median motion vector or by proportional scaling of the motion vectors.
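The three alternatives can be summarized in a brief sketch (function names and vectors are hypothetical; this illustrates the arithmetic, not the encoder's actual interfaces):

    import statistics

    def inherit_mv(predictor_mv):
        # Fusion-mode style: reuse the identified predictor's motion vector.
        return predictor_mv

    def mv_plus_residual(predictor_mv, residual_mv):
        # Predictor plus a signaled residual motion vector difference.
        return (predictor_mv[0] + residual_mv[0],
                predictor_mv[1] + residual_mv[1])

    def median_mv(candidate_mvs):
        # Component-wise median over several highly ranked predictors.
        return (statistics.median(mv[0] for mv in candidate_mvs),
                statistics.median(mv[1] for mv in candidate_mvs))

    print(inherit_mv((4, -2)))                    # (4, -2)
    print(mv_plus_residual((4, -2), (1, 1)))      # (5, -1)
    print(median_mv([(4, -2), (6, 0), (5, -1)]))  # (5, -1)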
Figure 11 is a block diagram showing an example of video decoder 60, which decodes a video sequence that is encoded in the manner described herein.
The techniques of this disclosure can be performed by the video decoder 60 in some examples.
In particular, the video decoder 60 can perform one or more of the motion information determination techniques for the current video block described herein as part of the decoding process.
The video decoder 60 includes an entropy decoding unit 52, which performs the decoding function corresponding to the encoding performed by the entropy coding unit 46 of Figure 2. In particular, the entropy decoding unit 52 can perform CAVLC or CABAC decoding or any other type of entropy decoding used by video encoder 50. Video decoder 60 also includes a prediction decoding unit 54, an inverse quantization unit 56, an inverse transform unit 58, a memory 62 and an adder 64. In particular, like video encoder 50, video decoder 60 includes a prediction decoding unit 54 and a filter unit 57. Prediction decoding unit 54 of video decoder 60 may include a motion compensation unit 86, which decodes inter-coded blocks and possibly includes one or more interpolation filters for sub-pixel interpolation in the motion compensation process. The prediction decoding unit 54 can support a plurality of modes 35, which include one or more modes that support AMVP and/or one or more fusion modes. Filter unit 57 can filter the output of adder 64 and can receive entropy-decoded filter information to define the filter coefficients applied in loop filtering.
Upon receiving encoded video data, the entropy decoding unit 52 performs decoding corresponding to the encoding performed by the entropy coding unit 46 (of encoder 50 of Figure 2). In the decoder, the entropy decoding unit 52 parses the bit stream to determine the LCUs and the corresponding partitioning associated with the LCUs. In some examples, an LCU, or the CUs of an LCU, may define the encoding modes that were used, and these encoding modes may include the bi-predictive fusion mode.
Therefore, the entropy decoding unit 52 can output the syntax information to the prediction unit that identifies the bi-predictive fusion mode.
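The recursive partitioning that unit 52 recovers when parsing can be pictured with a toy split-flag parse (a sketch under an assumed one-flag-per-CU syntax, not the actual bit stream format):

    def parse_cu_sizes(bits, size, min_size=8):
        # Consume one split flag per CU; a set flag divides the CU into four
        # quadrants, recursively, down to the minimum CU size.
        if size > min_size and next(bits) == "1":
            return [cu for _ in range(4)
                    for cu in parse_cu_sizes(bits, size // 2, min_size)]
        return [size]

    # A 64x64 LCU split once, with its first 32x32 sub-CU split again:
    print(parse_cu_sizes(iter("110000000"), 64))
    # [16, 16, 16, 16, 32, 32, 32]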
Figure 12 is a flowchart showing an example of a technique for decoding video data using the exemplary hierarchical orderings in Figures 9A-9I. It should be noted that, although Figure 12 is described in conjunction with video decoder 60, the steps described in Figure 12 can be performed by other devices and components. In step 350, the prediction decoding unit 54 obtains an index value for the current video block. As described above with respect to Figure 10, an index value indicates a predictive video block that can be used to generate a motion vector for the current video block. In step 352, the prediction decoding unit 54 obtains a partition type for the current video block. In one example, the partition type corresponds to one of the partition types described in Figures 9A-9I. In step 354, the prediction decoding unit 54 selects one of a plurality of defined sets of candidate predictive video blocks ordered based on the partition type. For example, if the partition type is Nx2N_1, the ordered set of candidate predictive video blocks can be defined as: T block, TR block, Temp, BL block, L block.
In step 356, the prediction decoding unit 54 selects a predictive video block from the selected set of the plurality of defined sets of candidate predictive video blocks based on the index value. In step 358, the prediction decoding unit 54 generates a motion vector. For example, prediction decoding unit 54 can generate a motion vector using any of the following techniques: inheriting a motion vector from the identified motion prediction video block, calculating a motion vector by adding residual motion vector information to, or subtracting it from, a motion vector of an identified motion prediction video block, or calculating a motion vector using motion vector information from one or more highly ranked motion prediction video blocks, by selecting the median motion vector or by proportional scaling of the motion vectors. In the example described in Figure 12, decoder 60 can be programmed to know the partition shape hierarchies described in Figures 9A-9I and, therefore, can properly interpret the received index value to make the same predictive video block selection as encoder 50. In one example, the highest ranked predictive video block in each of the sets defined in Figures 9A-9I can be assigned a one-bit index value. In other examples, only a subset (the top 3 of 5, for example) may be considered in any given scenario, which can reduce coding complexity. For an Nx2N_1 partition, for example, the ordered set of candidate predictive video blocks T block, TR block and Temp can be assigned the following index values: 1, 01 and 00. In this way, if several video blocks are encoded using only the index values of the top 3 video blocks, further bit savings can be achieved. In this case, video encoder 50 can signal the number of motion prediction video blocks for a group of encoded video blocks. This can be achieved in a manner similar to the way in which the set of left blocks is signaled in the example described with respect to Figure 7.
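Decoder-side, the same example can be sketched as follows (with the same illustrative assumptions as the encoder sketch above; the codeword table mirrors the index values 1, 01 and 00 just described):

    CODEWORD_TO_INDEX = {"1": 0, "01": 1, "00": 2}
    TOP3_NX2N_1 = ["T", "TR", "Temp"]  # top-3 subset for an Nx2N_1 partition

    def decode_motion_vector(codeword, neighbor_mvs):
        # Map the received codeword to a rank, pick the corresponding block
        # from the ordered subset, and inherit its motion vector (fusion mode).
        block = TOP3_NX2N_1[CODEWORD_TO_INDEX[codeword]]
        return neighbor_mvs[block]

    mvs = {"T": (9, 0), "TR": (4, -2), "Temp": (0, 3)}
    print(decode_motion_vector("01", mvs))  # (4, -2): the TR block's vector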
It must be recognized that, depending on the example, certain acts or events of any of the techniques
described here can be performed in a different sequence, may be added, merged or left out altogether (not all the acts or events described are necessary for the practice of the techniques). Furthermore, in certain examples, acts or events can be performed concurrently, for example through multi-threaded processing, interrupt processing or multiple processors, rather than sequentially.
In one or more examples, the functions described can be implemented in hardware, software, firmware or any combination thereof. If implemented in software, the functions can be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media including any medium that facilitates the transfer of a computer program from one place to another, for example according to a communication protocol. In this manner, computer-readable media can generally correspond to (1) tangible computer-readable storage media that are non-transitory or (2) a communication medium, such as a signal or carrier wave. The data storage media can be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in this disclosure.
For example, and not by way of limitation, such computer-readable storage media can
comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. In addition, any connection is properly called a computer-readable medium. For example, if instructions are transmitted from a website, server or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL) or wireless technologies such as infrared, radio and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL or wireless technologies such as infrared, radio and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals or other transient media, but are instead directed to tangible non-transitory storage media. The terms disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions can be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs) or other equivalent integrated or discrete logic circuits.
Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or to any other structure suitable for implementing the techniques described herein.
In addition, in some respects, the functionality described here can be provided within dedicated software modules or hardware modules configured to encode and decode, or incorporated into a combined codec.
In addition, the techniques can be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure can be performed in a wide variety of devices or apparatuses, including a wireless telephone handset, an integrated circuit (IC) or a set of ICs (a chip set, for example). Various components, modules or units are described to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units.
Instead, as described above, several units can be combined into one codec hardware unit or provided by a collection of interoperable hardware units, including one or more processors as described above, together with appropriate software and/or firmware.
Several examples have been described.
These and other examples are within the scope of the following claims.
Claims (15)
1. Method for decoding video data according to the fusion mode, the method comprising: obtaining (350) an index value for the current video block encoded in the fusion mode; generating a set of candidate predictive video blocks based on the temporal and spatial neighbors of the current video block; limiting the set of candidate predictive video blocks to a subset of the candidate predictive video blocks; selecting a predictive video block from the subset based on the index value; and generating (358) motion information for the current video block according to the fusion mode based on the motion information of the predictive video block, in which generating the motion information for the current video block comprises inheriting motion information from the predictive video block.
2. Method according to claim 1, in which the subset includes three candidate predictive video blocks selected from at least five possible video blocks.
3. Method according to claim 2, in which the set of candidate predictive video blocks includes: a left video block (L) adjacent to the current video block, an upper video block (T) adjacent to the current video block, an upper right video block (TR) adjacent to the current video block, a lower left video block (BL) adjacent to the current video block, and a temporal video block (Temp) adjacent to the current video block.
4. Method according to claim 1, in which inheriting motion information includes inheriting a motion vector and a reference frame index from the predictive video block.
5. Method according to claim 4, in which the set includes: an upper left video block (TL) adjacent to the current video block, an upper video block (T) adjacent to the current video block, an upper right video block (TR) adjacent to the current video block, a left video block (L) adjacent to the current video block, a lower left video block (BL) adjacent to the current video block, and a temporal video block (Temp) adjacent to the current video block.
6. Device for decoding video data according to the fusion mode, the device comprising: mechanisms for obtaining an index value for the current video block; mechanisms for generating a set of candidate predictive video blocks based on the temporal and spatial neighbors of the current video block; mechanisms for limiting the set of candidate predictive video blocks to a subset of the candidate predictive video blocks; mechanisms for selecting a predictive video block from the subset based on the index value; and mechanisms for generating motion information for the current video block according to the fusion mode based on the motion information of the predictive video block, in which generating the motion information for the current video block comprises inheriting motion information from the predictive video block.
7. Method for encoding video data according to the fusion mode, the method comprising: obtaining (250) a motion vector for a current video block; generating a set of candidate predictive video blocks based on the temporal and spatial neighbors of the current video block; limiting the set of candidate predictive video blocks to a subset of the candidate predictive video blocks; selecting a predictive video block from the subset based on the motion vector; and generating (258) an index value identifying the selected predictive video block.
8. Method according to claim 7, in which the subset includes three candidate predictive video blocks selected from at least five possible video blocks.
9. Method according to claim 8, in which the at least five possible video blocks include: a left video block (L) adjacent to the current video block, an upper video block (T) adjacent to the current video block, an upper right video block (TR) adjacent to the current video block, a lower left video block (BL) adjacent to the current video block, and a temporal video block (Temp) adjacent to the current video block.
10. Method according to claim 8, in which the set includes: an upper left video block (TL) adjacent to the current video block, an upper video block (T) adjacent to the current video block, an upper right video block (TR) adjacent to the current video block, a left video block (L) adjacent to the current video block, a lower left video block (BL) adjacent to the current video block, and a temporal video block (Temp) adjacent to the current video block.
11. Device for encoding video data according to the fusion mode, the device comprising: mechanisms for obtaining a motion vector for a current video block; mechanisms for generating a set of candidate predictive video blocks based on the temporal and spatial neighbors of the current video block; mechanisms for limiting the set of candidate predictive video blocks to a subset of the candidate predictive video blocks; mechanisms for selecting a predictive video block from the subset based on the motion vector; and mechanisms for generating an index value identifying the selected predictive video block.
12. Device according to claim 11, in which the subset includes three candidate predictive video blocks selected from at least five possible video blocks.
13. Device according to claim 12, in which the at least five possible video blocks include: a left video block (L) adjacent to the current video block, an upper video block (T) adjacent to the current video block, an upper right video block (TR) adjacent to the current video block, a lower left video block (BL) adjacent to the current video block, and a temporal video block (Temp) adjacent to the current video block.
14. Device according to claim 12, in which the set includes: an upper left video block (TL) adjacent to the current video block, an upper video block (T) adjacent to the current video block, an upper right video block (TR) adjacent to the current video block, a left video block (L) adjacent to the current video block, a lower left video block (BL) adjacent to the current video block, and a temporal video block (Temp) adjacent to the current video block.
15. A computer program product comprising a computer-readable storage medium having instructions stored thereon that, when executed, cause a processor to carry out the method as defined in any one of claims 1 to 5 or 7 to 10.