Patent abstract:
Sample region merging. The present invention relates, according to an embodiment, to a favorable merging or grouping of the simply connected regions into which the array of information samples is subdivided, coded with a reduced amount of data. To this end, for the simply connected regions, a predetermined relative locational relationship is defined, enabling an identification, for a predetermined simply connected region, of the simply connected regions within the plurality of simply connected regions that have the predetermined relative locational relationship to the predetermined simply connected region. Namely, if the number is zero, a merge indicator for the predetermined simply connected region may be absent from the data stream. According to yet other embodiments, a spatial subdivision of a sample array representing a spatial sampling of the two-dimensional information signal into a plurality of simply connected regions of different sizes by recursive multi-partitioning is performed according to a first subset of syntax elements contained in the data stream, followed by a combination of spatially neighboring simply connected regions depending on a second subset, to obtain an intermediate subdivision of the sample array into disjoint sets of simply connected regions whose union is the plurality of simply connected regions. The intermediate subdivision is used when reconstructing the sample array from the data stream.
Publication number: BR112012026393A2
Application number: R112012026393
Filing date: 2011-04-13
Publication date: 2020-04-14
Inventors: Marpe Detlev;Winken Martin;Helle Philipp;Oudin Simon;Wiegand Thomas
Applicant: Fraunhofer Ges Forschung;
IPC primary class:
Patent description:

Invention Patent Descriptive Report for: “SAMPLE REGION MERGING”. DESCRIPTION
The present invention relates to encoding schemes for two-dimensionally sampled information signals, such as videos or still images.
In image and video coding, the images, or particular sets of sample arrays for the images, are usually decomposed into blocks, which are associated with certain coding parameters. The images usually consist of multiple sample arrays. In addition, an image may also be associated with additional auxiliary sample arrays, which may, for example, specify transparency information or depth maps. The sample arrays of an image (including auxiliary sample arrays) can be grouped into one or more so-called plane groups, where each plane group consists of one or more sample arrays. The plane groups of an image can be coded independently or, if the image is associated with more than one plane group, with prediction from other plane groups of the same image. Each plane group is usually decomposed into blocks. The blocks (or the corresponding blocks of sample arrays) are predicted by either inter-picture prediction or intra-picture prediction. The blocks can have different sizes and can be either square or rectangular. The partitioning of an image into blocks can either be fixed by the syntax, or it can be (at least partially) signaled within the bit stream. Often, syntax elements are transmitted that signal the subdivision for blocks of predefined sizes. Such syntax elements may specify whether and how a block is subdivided into smaller blocks and associated coding parameters, for example, for the purpose of prediction. For all samples of a block (or the corresponding blocks of sample arrays), the decoding of the associated coding parameters is specified in a certain way.
As an example, all samples of a block are predicted using the same set of prediction parameters, such as reference indices (identifying a reference image in the set of already coded images), motion parameters (specifying a measure for the movement of the block between a reference image and the current image), parameters specifying the interpolation filter, intra-prediction modes, etc. The motion parameters can be represented by displacement vectors with a horizontal and a vertical component, or by higher-order motion parameters, such as affine motion parameters consisting of six components. It is also possible that more than one set of particular prediction parameters (such as reference indices and motion parameters) is associated with a single block. In that case, for each set of these particular prediction parameters, a single intermediate prediction signal for the block (or the corresponding blocks of sample arrays) is generated, and the final prediction signal is built by a combination including superposing the intermediate prediction signals. The corresponding weighting parameters, and possibly also a constant offset (which is added to the weighted sum), can either be fixed for an image, or a reference image, or a set of reference images, or they can be included in the set of prediction parameters for the corresponding block. The difference between the original blocks (or the corresponding blocks of sample arrays) and their prediction signals, also referred to as the residual signal, is usually transformed and quantized. Often, a two-dimensional transform is applied to the residual signal (or to the corresponding sample arrays of the residual block). For transform coding, the blocks (or the corresponding blocks of sample arrays) for which a particular set of prediction parameters has been used can be further divided before applying the transform. The transform blocks can be equal to or smaller than the blocks that are used for prediction. It is also possible that a transform block includes more than one of the blocks that are used for prediction. Different transform blocks can have different sizes, and the transform blocks can represent square or rectangular blocks. After the transform, the resulting transform coefficients are quantized and the so-called transform coefficient levels are obtained. The transform coefficient levels, as well as the prediction parameters and, if present, the subdivision information, are entropy coded.
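The combination of intermediate prediction signals described above can be sketched as follows. This is a minimal illustrative sketch, not the patent's method: the function name, the use of nested lists for sample arrays, and the 8-bit clipping range are all assumptions introduced for the example.

```python
# Hypothetical sketch: building a final prediction signal as a weighted
# sum of intermediate prediction signals plus a constant offset, as
# described above. Names and the 8-bit clipping range are assumptions.

def combine_predictions(intermediate_signals, weights, offset=0):
    """Weighted superposition of per-hypothesis prediction signals.

    intermediate_signals: list of equally sized 2D lists of sample values.
    weights: one weighting parameter per intermediate signal.
    offset: constant added to the weighted sum.
    """
    height = len(intermediate_signals[0])
    width = len(intermediate_signals[0][0])
    final = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = offset
            for sig, w in zip(intermediate_signals, weights):
                acc += w * sig[y][x]
            # clip to an assumed 8-bit sample range
            final[y][x] = max(0, min(255, round(acc)))
    return final
```

With weights 0.5/0.5 and zero offset this reduces to the familiar averaging bi-prediction case.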
In image and video coding standards, the possibilities for subdividing an image (or a plane group) into blocks that are provided by the syntax are very limited. Usually, it can only be specified whether, and potentially how, a block of a predefined size can be subdivided into smaller blocks. As an example, the largest block size in H.264 is 16x16. The 16x16 blocks are also referred to as macroblocks, and each image is partitioned into macroblocks in a first step. For each 16x16 macroblock, it can be signaled whether it is coded as one 16x16 block, or as two 16x8 blocks, or as two 8x16 blocks, or as four 8x8 blocks. If a 16x16 block is subdivided into four 8x8 blocks, each of these 8x8 blocks can be coded either as one 8x8 block, or as two 8x4 blocks, or as two 4x8 blocks, or as four 4x4 blocks. The reduced set of possibilities for specifying the block partitioning in state-of-the-art image and video coding standards has the advantage that the side information rate for signaling the subdivision can be kept small, but it has the disadvantage that the bit rate required for transmitting the prediction parameters for the blocks can become significant, as explained in the following. The side information rate for signaling the prediction information does usually represent a significant amount of the overall bit rate of a block. And the coding efficiency could be increased when this side information is reduced, which, for example, could be achieved by using larger block sizes. Actual images or images of a video sequence consist of arbitrarily shaped objects with specific properties. As an example, such objects or parts of objects are characterized by a unique texture or a unique movement. And usually, the same set of prediction parameters can be applied for such an object or part of an object. But the object boundaries usually do not coincide with the possible block boundaries for large prediction blocks (for example, 16x16 macroblocks in H.264). An encoder usually determines the subdivision (among the limited set of possibilities) that results in the minimum of a particular rate-distortion cost measure. For arbitrarily shaped objects, this can result in a large number of small blocks. And since each of these small blocks is associated with a set of prediction parameters that need to be transmitted, the side information rate can become a significant part of the overall bit rate. However, since several of the small blocks still represent areas of the same object or part of an object, the prediction parameters for a number of the obtained blocks are the same or very similar.
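The H.264-style partitioning choices just described can be enumerated in a short sketch. The function names are illustrative, and the count is simply derived from the options stated above (three undivided-or-halved macroblock modes plus an 8x8 mode in which each of the four 8x8 blocks chooses independently among four options).

```python
# Illustrative sketch of the H.264-style partitioning choices described
# above: a 16x16 macroblock may be coded as one 16x16, two 16x8, two 8x16
# or four 8x8 blocks, and each 8x8 block may in turn be coded as one 8x8,
# two 8x4, two 4x8 or four 4x4 blocks. Names are assumptions.

def partitions_of_8x8():
    # the four ways a single 8x8 block can be coded, as lists of block sizes
    return [[(8, 8)], [(8, 4)] * 2, [(4, 8)] * 2, [(4, 4)] * 4]

def count_macroblock_partitionings():
    # 16x16, 16x8 and 8x16 contribute one partitioning each; in the 8x8
    # mode, each of the four 8x8 blocks chooses independently among 4 options
    return 3 + 4 ** 4
```

The 259 distinct partitionings illustrate how limited this set is compared to the recursive multi-tree subdivision discussed below.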
That is, the subdivision or tiling of an image into smaller portions or tiles or blocks substantially influences the coding efficiency and the coding complexity. As described above, subdividing an image into a larger number of smaller blocks enables a spatially finer setting of the coding parameters, which allows a better adaptation of these coding parameters to the image/video material. On the other hand, setting the coding parameters at a finer granularity places a greater burden on the amount of side information needed in order to inform the decoder about the necessary settings. Moreover, it should be noted that any freedom for the encoder to (further) subdivide the image/video spatially into blocks greatly increases the number of possible coding parameter settings and thereby, in general, makes the search for the coding parameter setting that leads to the best rate/distortion trade-off even more difficult.
It is an objective to provide a coding scheme for coding an array of samples representing a spatially sampled two-dimensional information signal, such as, but not restricted to, images of a video or still images, which enables a better trade-off between coding complexity and achievable rate-distortion ratio, and/or achieves a better rate-distortion ratio.
This objective is achieved by a decoder according to claim 1 or 8, an encoder according to any one of claims 19, 20, 25 and 26, methods according to claim 20, 22, 23 or 24, a computer program according to claim 27, and a data stream according to claim 28 or 30.
According to an embodiment, a favorable merging or grouping of the simply connected regions into which the array of information samples is subdivided is coded with a reduced amount of data. To this end, for the simply connected regions, a predetermined relative locational relationship is defined, enabling an identification, for a predetermined simply connected region, of the simply connected regions within the plurality of simply connected regions that have the predetermined relative locational relationship to the predetermined simply connected region. Namely, if the number is zero, a merge indicator for the predetermined simply connected region may be absent from the data stream. Further, if the number of simply connected regions having the predetermined locational relationship to the predetermined simply connected region is one, the coding parameters of that simply connected region may be adopted, or may be used for a prediction of the coding parameters of the predetermined simply connected region, without the need for any additional syntax element. Otherwise, that is, if the number of simply connected regions having the predetermined locational relationship to the predetermined simply connected region is greater than one, the introduction of an additional syntax element may be suppressed even if the coding parameters associated with these identified simply connected regions are identical to each other.
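The signaling rule just described can be sketched in a few lines. This is an illustrative sketch only: the assumption that the predetermined relative locational relationship selects the top and left neighbors is introduced for the example, as are all names.

```python
# Hypothetical sketch of the merge-signaling rule described above. The
# candidates of a block are its already-coded neighbors in a predetermined
# relative location (assumed here: the block directly above and the block
# directly to the left). Whether a merge indicator must be present in the
# data stream depends only on the number of such candidates.

def merge_candidates(block, coded_blocks):
    """Return already-coded blocks directly above and to the left."""
    x, y = block
    return [b for b in coded_blocks if b in ((x, y - 1), (x - 1, y))]

def merge_flag_needed(block, coded_blocks):
    # zero candidates: the merge indicator can be omitted entirely
    return len(merge_candidates(block, coded_blocks)) > 0
```

With exactly one candidate, the coding parameters of that candidate could be adopted directly, again without any additional syntax element, as stated above.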
According to an embodiment, if the coding parameters of the neighboring simply connected regions are unequal to each other, a reference neighbor identifier may identify a proper subset of the number of simply connected regions having the predetermined locational relationship to the predetermined simply connected region, and this proper subset is used when adopting the coding parameters or predicting the coding parameters of the predetermined simply connected region.
According to yet other embodiments, a spatial subdivision of an array of samples representing a spatial sampling of the two-dimensional information signal into a plurality of simply connected regions of different sizes by recursive multi-partitioning is performed according to a first subset of syntax elements contained in the data stream, followed by a combination of spatially neighboring simply connected regions depending on a second subset of syntax elements within the data stream, disjoint from the first subset, to obtain an intermediate subdivision of the sample array into disjoint sets of simply connected regions whose union is the plurality of simply connected regions. The intermediate subdivision is used when reconstructing the array of samples from the data stream. This renders the optimization with respect to the subdivision less critical, owing to the fact that a too fine subdivision can be compensated by the merging afterwards. Moreover, the combination of subdivision and merging allows intermediate subdivisions to be achieved that would not be possible by recursive multi-partitioning alone, so that the concatenation of the subdivision and the merging, using disjoint sets of syntax elements, allows a better adaptation of the intermediate subdivision to the actual content of the two-dimensional information signal. Compared to these advantages, the additional overhead resulting from the additional subset of syntax elements for indicating the merging details is negligible.
According to an embodiment, the array of information samples representing the spatially sampled information signal is first spatially divided into tree root regions, with then subdividing, according to multi-tree subdivision information extracted from the data stream, at least a subset of the tree root regions into smaller simply connected regions of different sizes by recursively multi-partitioning the subset of tree root regions. In order to find a good compromise, in rate-distortion sense, between a too fine and a too coarse subdivision at reasonable coding complexity, the maximum region size of the tree root regions into which the array of information samples is spatially divided is included within the data stream and is extracted from the data stream at the decoding side. Accordingly, a decoder may comprise an extractor configured to extract a maximum region size and multi-tree subdivision information from a data stream, a subdivider configured to spatially divide an array of information samples representing a spatially sampled information signal into tree root regions of the maximum region size and to subdivide, according to the multi-tree subdivision information, at least a subset of the tree root regions into smaller simply connected regions of different sizes by recursively multi-partitioning the subset of tree root regions, and a reconstructor configured to reconstruct the array of information samples from the data stream using the subdivision into the smaller simply connected regions.
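The decoding-side subdivision just described can be sketched as follows. This is an illustrative sketch under stated assumptions: the multi-tree is modeled as a quadtree, the split decisions (which the decoder would read from the data stream) are modeled as a caller-supplied function, and the minimum region size and all names are assumptions for the example.

```python
# Hypothetical sketch of the subdivision described above: the sample array
# is first divided into tree root regions of a maximum size taken from the
# data stream, and each root region is then recursively quad-split
# according to multi-tree subdivision information (modeled here as a
# function mapping a region to a split / no-split decision).

def subdivide(x, y, size, should_split, min_size=4):
    """Return the leaf regions (x, y, size) of one tree root region."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += subdivide(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]

def subdivide_image(width, height, max_root_size, should_split):
    """Divide the whole array into root regions, then subdivide each."""
    leaves = []
    for y in range(0, height, max_root_size):
        for x in range(0, width, max_root_size):
            leaves += subdivide(x, y, max_root_size, should_split)
    return leaves
```

Transmitting `max_root_size` in the stream, as described above, lets the encoder trade signaling cost against subdivision flexibility per sequence.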
According to an embodiment, the data stream also contains the maximum hierarchy level up to which the subset of tree root regions is subjected to the recursive multi-partitioning. By this measure, the signaling of the multi-tree subdivision information becomes easier and needs fewer bits for coding.
In addition, the reconstructor may be configured to perform one or more of the following measures at a granularity that depends on the intermediate subdivision: deciding which prediction mode to use among at least intra and inter prediction modes; retransformation from the spectral to the spatial domain; performing and/or setting parameters for an inter-prediction; performing and/or setting parameters for an intra-prediction.
Moreover, the extractor may be configured to extract the syntax elements associated with the leaf regions of the partitioned tree blocks from the data stream in a depth-first traversal order. By this measure, the extractor is able to exploit the statistics of syntax elements of already-coded neighboring leaf regions with a higher probability than when using a breadth-first traversal order.
According to another embodiment, a further subdivider is used to subdivide, according to a further multi-tree subdivision information, at least a subset of the smaller simply connected regions into even smaller simply connected regions. The first-stage subdivision may be used by the reconstructor for performing the prediction of the array of information samples, while the second-stage subdivision may be used by the reconstructor for performing the retransformation from the spectral to the spatial domain. Defining the residual subdivision to be subordinate to the prediction subdivision renders the coding of the overall subdivision less bit consuming; on the other hand, the restriction of the freedom for the residual subdivision resulting from the subordination has merely minor negative effects on the coding efficiency, since mostly, portions of images having similar motion-compensation parameters are larger than portions having similar spectral properties.
According to a further embodiment, a further maximum region size is contained in the data stream, the further maximum region size defining the size of the subtree root regions into which the smaller simply connected regions are first divided before subdividing at least a subset of the subtree root regions, according to the further multi-tree subdivision information, into even smaller simply connected regions. This, in turn, enables an independent setting of the maximum region sizes of the prediction subdivision on the one hand and the residual subdivision on the other hand, and thus enables finding a better rate/distortion trade-off.
According to yet another embodiment of the present invention, the data stream comprises a first subset of syntax elements disjoint from a second subset of syntax elements forming the multi-tree subdivision information, wherein a combiner at the decoding side is able to combine, according to the first subset of syntax elements, spatially neighboring smaller simply connected regions of the multi-tree subdivision to obtain an intermediate subdivision of the sample array. The reconstructor may be configured to reconstruct the sample array using the intermediate subdivision. By this measure, it is easier for the encoder to adapt the effective subdivision to the spatial distribution of the properties of the array of information samples and to find an optimum rate/distortion trade-off. For example, if the maximum region size is large, the multi-tree subdivision information is likely to become more complex due to the tree root regions becoming larger. On the other hand, however, if the maximum region size is small, it becomes more likely that neighboring tree root regions pertain to information content with similar properties, so that these tree root regions could also have been processed together. The merging fills this gap between the aforementioned extremes, thereby enabling a near-optimum subdivision granularity. From the encoder's point of view, the merging syntax elements allow for a more relaxed or computationally less complex encoding procedure, since if the encoder erroneously uses a too fine subdivision, this error can be compensated afterwards by setting the merging syntax elements, with or without adapting a small part of the syntax elements having been set before the merging syntax elements were defined.
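The merging stage just described, which groups the leaves of the multi-tree subdivision into disjoint sets whose union is the full set of leaves, can be sketched with a small union-find structure. This is an illustrative sketch: representing regions by opaque identifiers and the merge decisions as pairs are assumptions introduced for the example.

```python
# Hypothetical sketch of the merging stage described above: given the leaf
# regions of the multi-tree subdivision and pairwise merge decisions
# (which the decoder would derive from the merging syntax elements), form
# disjoint groups of simply connected regions, i.e. the intermediate
# subdivision. All names are illustrative.

def merge_regions(leaves, merge_pairs):
    """leaves: list of region ids; merge_pairs: iterable of (a, b) to merge."""
    parent = {leaf: leaf for leaf in leaves}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    for a, b in merge_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for leaf in leaves:
        groups.setdefault(find(leaf), []).append(leaf)
    return sorted(groups.values())
```

Coding parameters then need to be transmitted once per group rather than once per leaf, which is the source of the rate saving argued above.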
According to a further embodiment, the maximum region size and the multi-tree subdivision information are used for the residual subdivision rather than the prediction subdivision.
According to an embodiment, a depth-first traversal order for treating the simply connected regions of a quadtree subdivision of an array of samples representing a spatially sampled information signal is used rather than a breadth-first traversal order. By using the depth-first traversal order, each simply connected region has a higher probability of having neighboring simply connected regions which have already been traversed, so that information regarding these neighboring simply connected regions can be positively exploited when reconstructing the respective current simply connected region.
When the array of information samples is first divided into a regular arrangement of tree root regions of zero-order hierarchy size, and at least a subset of the tree root regions is then subdivided into smaller simply connected regions of different sizes, the reconstructor may use a zigzag scan to scan the tree root regions, wherein, for each tree root region to be partitioned, the simply connected leaf regions are treated in depth-first traversal order before stepping to the next tree root region in the zigzag scan order. Moreover, according to the depth-first traversal order, simply connected leaf regions of the same hierarchy level may also be traversed in a zigzag scan order. Thus, the increased likelihood of having already-traversed neighboring simply connected leaf regions is maintained.
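The traversal just described can be sketched as follows. This is an illustrative sketch under stated assumptions: a quadtree, a row-by-row zigzag scan of root regions, and a top-left, top-right, bottom-left, bottom-right zigzag order among the four children of each split; all names are assumptions.

```python
# Hypothetical sketch of the traversal described above: tree root regions
# are visited in a zigzag scan, and within each root region the leaf
# regions are visited depth-first, with the four children of every split
# visited in zigzag order (top-left, top-right, bottom-left, bottom-right).

def leaf_order(x, y, size, is_split):
    if is_split(x, y, size):
        half = size // 2
        order = []
        # zigzag order among the four children of this split
        for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
            order += leaf_order(x + dx, y + dy, half, is_split)
        return order
    return [(x, y, size)]

def traversal(width, height, root_size, is_split):
    order = []
    for y in range(0, height, root_size):   # zigzag scan of root regions
        for x in range(0, width, root_size):
            order += leaf_order(x, y, root_size, is_split)
    return order
```

In the resulting order, the top and left neighbors of most leaves have already been visited, which is exactly the property exploited for context modeling and parameter prediction.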
According to an embodiment, although the flags associated with the nodes of the multi-tree structure are sequentially arranged in a depth-first traversal order, the sequential coding of the flags uses probability estimation contexts that are the same for flags associated with nodes of the multi-tree structure lying within the same hierarchy level of the multi-tree structure, but different for nodes of the multi-tree structure lying within different hierarchy levels of the multi-tree structure, thereby allowing a good compromise between the number of contexts to be provided on the one hand and the adaptation to the actual symbol statistics of the flags on the other hand.
According to an embodiment, the probability estimation context used for a predetermined flag also depends on the flags preceding the predetermined flag according to the depth-first traversal order and corresponding to areas of the tree root region having a predetermined locational relationship to the area to which the predetermined flag corresponds. Similar to the idea underlying the preceding aspect, the use of the depth-first traversal order guarantees a high probability that the flags already coded comprise the flags corresponding to areas neighboring the area corresponding to the predetermined flag, so that this knowledge may be used to better adapt the context to be used for the predetermined flag.
The flags that may be used for setting the context of a predetermined flag may be those corresponding to the areas lying to the top and/or to the left of the area to which the predetermined flag corresponds. Moreover, the flags used for selecting the context may be restricted to flags belonging to the same hierarchy level as the node with which the predetermined flag is associated.
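The context selection just described can be sketched in a few lines. This is an illustrative sketch only: the layout of three contexts per hierarchy level (for zero, one or two split top/left neighbors) and all names are assumptions introduced for the example.

```python
# Hypothetical sketch of the context selection described above: the flag
# of an area picks its probability-estimation context from the hierarchy
# level of its node plus the already-coded flags of the same-sized areas
# to the top and to the left. Three contexts per level is an assumption.

CONTEXTS_PER_LEVEL = 3  # assumed: 0, 1 or 2 "split" neighbors

def context_index(level, coded_flags, x, y, size):
    """coded_flags: dict mapping (x, y, size) -> flag for coded areas."""
    top = coded_flags.get((x, y - size, size), False)
    left = coded_flags.get((x - size, y, size), False)
    return level * CONTEXTS_PER_LEVEL + int(top) + int(left)
```

Keeping the contexts per level and deriving the index from already-coded neighbor flags reflects the compromise argued above between context count and adaptation to the actual flag statistics.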
According to an embodiment, the coded signaling comprises an indication of a highest hierarchy level and a sequence of flags associated with the nodes of the multi-tree structure unequal to the highest hierarchy level, each flag indicating whether the associated node is an intermediate node or a child node, and a sequential decoding, in a depth-first or breadth-first traversal order, of the sequence of flags from the data stream takes place, with skipping nodes of the highest hierarchy level and automatically appointing them leaf nodes, thereby reducing the coding rate.
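The flag-skipping rule just described can be sketched with a small decoder. This is an illustrative sketch under stated assumptions: a quadtree, a depth-first traversal, and a flag value 1 meaning "split"; the tree representation and all names are assumptions for the example.

```python
# Hypothetical sketch of the rule described above: split flags are read
# only for nodes below the maximum hierarchy level; a node at the maximum
# level carries no flag in the stream and is automatically a leaf.
# Depth-first traversal and a quadtree (4 children) are assumed.

def decode_tree(flags, max_level):
    """flags: iterator of 0/1 split flags; returns a nested tree.

    A leaf is represented as 'leaf', a split node as a list of 4 subtrees.
    """
    def node(level):
        if level == max_level:
            return 'leaf'          # no flag coded at the maximum level
        if next(flags):
            return [node(level + 1) for _ in range(4)]
        return 'leaf'
    return node(0)
```

In the test below, the four nodes at the maximum hierarchy level consume no flags at all; without the skipping rule, four further flags would have been needed, which is the rate saving argued above.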
According to another embodiment, the coded signaling of the multi-tree structure may comprise the indication of the highest hierarchy level. By this measure, it is possible to restrict the existence of flags to hierarchy levels other than the highest hierarchy level, since further partitioning of blocks at the highest hierarchy level is excluded anyway.
In the case of the spatial multi-tree subdivision being part of a secondary subdivision of the leaf nodes and unpartitioned tree root regions of a primary multi-tree subdivision, the contexts used for coding the flags of the secondary subdivision may be selected such that the contexts are the same for flags associated with areas of the same size.
Preferred embodiments of the present invention are described below with respect to the following figures, among which:
Fig. 1 shows a block diagram of an encoder according to an embodiment of the present application;
Fig. 2 shows a block diagram of a decoder according to an embodiment of the present application;
Figs. 3a-c show schematically an illustrative example of a quadtree subdivision, in which Fig. 3a shows a first level of hierarchy, Fig. 3b shows a second level of hierarchy and Fig. 3c shows a third level of hierarchy;
Fig. 4 schematically shows a tree structure for the illustrative quadtree subdivision of Figs. 3a to 3c according to an embodiment;
Figs. 5a, b schematically illustrate the quadtree subdivision of Figs. 3a to 3c and the tree structure with indexes that index the individual leaf blocks;
Figs. 6a, b schematically show binary chains or sequences of flags representing the tree structure of Fig. 4 and the quadtree subdivision of Figs. 3a to 3c, respectively, according to different embodiments;
Fig. 7 shows a flowchart showing the steps performed by a data stream extractor according to an embodiment;
Fig. 8 shows a flowchart illustrating the functionality of a data stream extractor according to a further embodiment;
Figs. 9a, b show schematic diagrams illustrating quadtree subdivisions with neighboring candidate blocks for a predetermined block highlighted, according to an embodiment;
Fig. 10 shows a flowchart of a functionality of a data stream extractor according to a further embodiment;
Fig. 11 schematically shows a composition of an image out of planes and plane groups and illustrates a coding using inter-plane adaptation/prediction according to an embodiment;
Fig. 12a and 12b schematically illustrate a sub-tree structure and the corresponding subdivision, in order to illustrate the inheritance scheme according to an embodiment;
Figs. 12c and 12d schematically illustrate a subtree structure, in order to illustrate the inheritance scheme with adoption and prediction, respectively, according to embodiments;
Fig. 13 shows a flowchart showing the steps performed by an encoder for carrying out an inheritance scheme according to an embodiment;
Figs. 14a and 14b show a primary subdivision and a subordinate subdivision to illustrate a possibility of implementing an inheritance scheme in connection with inter-prediction according to an embodiment;
Fig. 15 shows a block diagram illustrating a decoding process, in connection with the inheritance scheme according to an embodiment;
Fig. 16 shows a schematic diagram illustrating the search order among the sub-regions of a multitree subdivision according to an embodiment, with the sub-regions that are the object of intra-prediction;
Fig. 17 shows a block diagram of a decoder according to an embodiment;
Fig. 18a-c show a schematic diagram that illustrates different possibilities of subdivisions according to other embodiments;
Fig. 19 shows a block diagram of an encoder according to an embodiment;
Fig. 20 shows a block diagram of a decoder according to another embodiment, and
Fig. 21 shows a block diagram of an encoder according to another embodiment.
In the following description of the figures, elements occurring in several of these figures are indicated by common reference numbers, and a repeated explanation of these elements is avoided. Rather, explanations regarding an element presented within one figure also apply to the other figures in which the respective element occurs, as long as the explanation presented with those other figures does not indicate deviations therefrom.
Further, the following description starts with embodiments of an encoder and decoder, which are explained with respect to Figs. 1 to 11. The embodiments described with respect to these figures combine several aspects of the present application which, however, would also be advantageous if implemented individually within a coding scheme; accordingly, with respect to the subsequent figures, embodiments are briefly discussed which exploit the just-mentioned aspects individually, with each of these embodiments representing an abstraction of the embodiments described with respect to Figs. 1 and 11 in a different sense.
Fig. 1 shows an encoder according to an embodiment of the present invention. The encoder 10 of Fig. 1 comprises a predictor 12, a residual precoder 14, a residual reconstructor 16, a data stream inserter 18 and a block divider 20. The encoder 10 is for encoding a temporal spatially sampled information signal into a data stream 22. The temporal spatially sampled information signal may be, for example, a video, i.e., a sequence of images. Each image represents an array of image samples. Other examples of temporal spatial information signals comprise, for example, depth images captured by, for example, time-of-flight cameras. Further, it should be noted that a spatially sampled information signal may comprise more than one array per frame or time stamp, such as in the case of a color video, which comprises, for example, an array of luminance samples along with two arrays of chrominance samples per frame. It may also be possible that the temporal sampling rate for the different components of the information signal, i.e., luminance and chrominance, is different. The same applies to the spatial resolution. A video may also be accompanied by further spatially sampled information, such as depth or transparency information. The following description, however, will focus on the processing of one of these arrays first, for the sake of a better understanding of the main aspects of this application, then turning to the treatment of more than one plane.
The encoder 10 of Fig. 1 is configured to create the data stream 22 such that the syntax elements of the data stream 22 describe the images at a granularity lying between whole images and individual image samples. To this end, the divider 20 is configured to subdivide each image 24 into simply connected regions of different sizes 26. In the following, these regions will simply be called blocks or subregions 26.
As will be described in more detail below, the divider 20 uses a multi-tree subdivision in order to subdivide the image 24 into the blocks 26 of different sizes. To be even more precise, the specific embodiments described below with respect to Figs. 1 to 11 mostly use a quadtree subdivision. As will also be explained in more detail below, the divider 20 may, internally, comprise a concatenation of a subdivider 28 for subdividing the images 24 into the aforementioned blocks 26, followed by a merger 30 which allows combining groups of these blocks 26 in order to obtain an effective subdivision or granularity lying between the non-subdivision of the images 24 and the subdivision defined by the subdivider 28.
As illustrated by the dashed lines in Fig. 1, the predictor 12, the residual precoder 14, the residual reconstructor 16 and the data stream inserter 18 operate on the image subdivisions defined by the divider 20. For example, as will be described in more detail below, the predictor 12 uses a prediction subdivision defined by the divider 20 in order to determine, for the individual subregions of the prediction subdivision, whether the respective subregion is to be subjected to intra-picture prediction or inter-picture prediction, with setting the corresponding prediction parameters for the respective subregion according to the chosen prediction mode.
The residual precoder 14, in turn, may use a residual subdivision of the images 24 in order to encode the residual of the prediction of the images 24 provided by the predictor 12. As the residual reconstructor 16 reconstructs the residual from the syntax elements output by the residual precoder 14, the residual reconstructor 16 also operates on the aforementioned residual subdivision. The data stream inserter 18 may exploit the divisions just mentioned, i.e., the prediction and residual subdivisions, in order to determine the insertion order and neighborship among the syntax elements for the insertion of the syntax elements output by the residual precoder 14 and the predictor 12 into the data stream 22 by means of, for example, entropy coding.
As shown in Fig. 1, encoder 10 comprises an input 32 at which the original information signal enters encoder 10. A subtractor 34, the residual precoder 14 and the data stream inserter 18 are connected in series, in the order mentioned, between input 32 and the output of data stream inserter 18 at which the encoded data stream 22 is output. Subtractor 34 and residual precoder 14 are part of a prediction loop which is closed by residual reconstructor 16, adder 36 and predictor 12, which are connected in series, in the order mentioned, between the output of residual precoder 14 and the inverting input of subtractor 34. The output of predictor 12 is also connected to a further input of adder 36. In addition, predictor 12 comprises an input directly connected to input 32 and may comprise a further input also connected to the output of adder 36 via an optional in-loop filter 38. Further, predictor 12 generates side information during operation, and, therefore, an output of predictor 12 is also coupled to data stream inserter 18. Likewise, divider 20 comprises an output which is connected to a further input of data stream inserter 18.
Having described the structure of the encoder 10, the mode of operation is described in more detail below.
As described above, divider 20 decides for each image 24 how to subdivide it into sub-regions 26. In accordance with a subdivision of image 24 to be used for prediction, predictor 12 decides, for each sub-region corresponding to this subdivision, how to predict the respective sub-region. Predictor 12 outputs the prediction of the sub-region to the inverting input of subtractor 34 and to the further input of adder 36, and outputs prediction information reflecting the way in which predictor 12 obtained this prediction from previously encoded portions of the video to data stream inserter 18.
At the output of subtractor 34, the prediction residual is thus obtained, and residual precoder 14 processes this prediction residual in accordance with a residual subdivision also prescribed by divider 20. As described in more detail below with respect to Figs. 3 to 10, the residual subdivision of image 24 used by residual precoder 14 may be related to the prediction subdivision used by predictor 12 such that each prediction sub-region is either adopted as a residual sub-region or further subdivided into smaller residual sub-regions. However, totally independent prediction and residual subdivisions would also be possible.
Residual precoder 14 subjects each residual sub-region to a transformation from the spatial into the spectral domain by a two-dimensional transform, followed by, or inherently involving, a quantization of the resulting transform coefficients of the resulting transform blocks, whereby distortion results from the quantization noise. The data stream inserter 18 may, for example, losslessly encode the syntax elements describing the aforementioned transform coefficients into data stream 22 using, for example, entropy coding.
The residual reconstructor 16, in turn, reconverts, by use of a requantization followed by a retransformation, the transform coefficients into a residual signal, which residual signal is then combined, within adder 36, with the prediction used by subtractor 34 for obtaining the prediction residual, thereby obtaining a reconstructed portion or sub-region of the current image at the output of adder 36. Predictor 12 may use the reconstructed image sub-region for intra prediction directly, that is, for predicting a certain prediction sub-region by extrapolation from previously reconstructed prediction sub-regions in the neighborhood. However, an intra prediction performed within the spectral domain, by predicting the spectrum of the current sub-region from that of a neighboring one directly, would theoretically also be possible.
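The loop just described can be illustrated with a small sketch. The plain scalar quantizer below is a stand-in assumption (the text does not fix a particular quantizer, and the step size `QSTEP` is hypothetical); it only serves to show how subtractor 34, residual precoder 14, residual reconstructor 16 and adder 36 interact, and why encoder and decoder arrive at identical reconstructed samples.

```python
# Minimal sketch of the prediction loop around subtractor 34, adder 36 and
# residual reconstructor 16. A scalar quantizer stands in for the
# transform/quantization stage (an illustrative assumption).

QSTEP = 4  # hypothetical quantization step size


def quantize(residual):
    # Residual precoder 14: quantization loses information (quantization noise).
    return [round(r / QSTEP) for r in residual]


def dequantize(levels):
    # Residual reconstructor 16/106: requantization (scaling back).
    return [l * QSTEP for l in levels]


def encode_and_reconstruct(original, prediction):
    # Subtractor 34: prediction residual.
    residual = [o - p for o, p in zip(original, prediction)]
    levels = quantize(residual)
    # Adder 36: reconstructed samples, identical on encoder and decoder side,
    # since both only use the transmitted (quantized) levels.
    reconstructed = [p + r for p, r in zip(prediction, dequantize(levels))]
    return levels, reconstructed


original = [100, 103, 98, 90]
prediction = [101, 101, 101, 101]  # e.g., produced by predictor 12
levels, recon = encode_and_reconstruct(original, prediction)
# The reconstruction error stays bounded by half the quantization step.
assert all(abs(o - r) <= QSTEP / 2 for o, r in zip(original, recon))
```

The same `dequantize` plus addition is what the decoder side performs; nothing in the reconstruction depends on data unavailable to the decoder.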
For inter prediction, predictor 12 may use previously encoded and reconstructed images in a version according to which they have been filtered by an optional in-loop filter 38. In-loop filter 38 may, for example, comprise a deblocking filter or an adaptive filter having a transfer function adapted to advantageously shape the previously mentioned quantization noise.
Predictor 12 chooses the prediction parameters revealing the way of predicting a certain prediction sub-region by use of a comparison with the original samples within image 24. The prediction parameters may, as described in more detail below, comprise for each prediction sub-region an indication of the prediction mode, such as intra-image prediction or inter-image prediction. In the case of intra-image prediction, the prediction parameters may also comprise an indication of an angle along which edges within the intra-predicted prediction sub-region mainly extend; in the case of inter-image prediction, motion vectors, reference image indices and, eventually, higher-order motion transformation parameters; and, in the case of intra- and/or inter-image prediction, optional filter information for filtering the reconstructed image samples based on which the current prediction sub-region is predicted.
As will be described in more detail below, the aforementioned subdivisions defined by divider 20 substantially influence the maximally achievable rate/distortion ratio of residual precoder 14, predictor 12 and data stream inserter 18. In the case of a too-fine subdivision, the prediction parameters 40 output by predictor 12 to be inserted into data stream 22 require a too-large coding rate, although the prediction obtained by predictor 12 may be better and the residual signal to be encoded by residual precoder 14 may be smaller so that it may be encoded with fewer bits. In the case of a too-coarse subdivision, the opposite applies. Further, the just-mentioned thought also applies to the residual subdivision in a similar way: a transformation of an image using a finer granularity of the individual transform blocks leads to a lower complexity for computing the transforms and an increased spatial resolution of the resulting transform. That is, smaller residual sub-regions enable the spectral distribution of the content within the individual residual sub-regions to be more consistent. However, the spectral resolution is reduced, and the ratio between significant and insignificant, that is, quantized-to-zero, coefficients gets worse. That is, the granularity of the transform should be adapted locally to the image content. Additionally, independent of the positive effect of a finer granularity, a finer granularity regularly increases the amount of side information necessary in order to indicate the chosen subdivision to the decoder. As will be described in more detail below, the embodiments described below provide encoder 10 with the ability to adapt the subdivisions very effectively to the content of the information signal to be encoded, and to signal the subdivisions to be used to the decoding side by instructing data stream inserter 18 to insert the subdivision information into the encoded data stream 22. Details are presented below.
However, before describing the subdivision performed by divider 20 in more detail, a decoder according to an embodiment of the present application is described in more detail with reference to Fig. 2.
The decoder of Fig. 2 is indicated by reference sign 100 and comprises an extractor 102, a divider 104, a residual reconstructor 106, an adder 108, a predictor 110, an optional in-loop filter 112 and an optional post-filter 114. Extractor 102 receives the encoded data stream at an input 116 of decoder 100 and extracts therefrom subdivision information 118, prediction parameters 120 and residual data 122, which extractor 102 forwards to image divider 104, predictor 110 and residual reconstructor 106, respectively. Residual reconstructor 106 has an output connected to a first input of adder 108. The other input of adder 108 and its output are connected into a prediction loop into which the optional in-loop filter 112 and predictor 110 are connected in series, in the order mentioned, with a bypass path leading from the output of adder 108 to predictor 110, similar to the above-mentioned connections between adder 36 and predictor 12 in Fig. 1, namely one for intra-image prediction and the other for inter-image prediction. Either the output of adder 108 or the output of in-loop filter 112 may be connected to an output 124 of decoder 100, at which the reconstructed information signal is output to, for example, a reproduction device. An optional post-filter 114 may be connected into the path leading to output 124 in order to improve the visual impression of the reconstructed signal at output 124.
In general, residual reconstructor 106, adder 108 and predictor 110 act like elements 16, 36 and 12 in Fig. 1. In other words, they emulate the operation of the aforementioned elements of Fig. 1. To this end, residual reconstructor 106 and predictor 110 are controlled by the prediction parameters 120 and by the subdivision prescribed by image divider 104 in accordance with the subdivision information 118 from extractor 102, respectively, in order to predict the prediction sub-regions the same way as predictor 12 did or decided to do, and in order to retransform the received transform coefficients at the same granularity as residual precoder 14 did. Image divider 104, in turn, rebuilds the subdivisions chosen by divider 20 of Fig. 1 in a synchronized manner, relying on the subdivision information 118. Extractor 102 may, in turn, use the subdivision information in order to control the data extraction, such as in terms of context selection, neighborhood determination, probability estimation, parsing of the data stream syntax, etc.
Several deviations from the above embodiments may be performed. Some are mentioned within the detailed description below regarding the subdivision performed by subdivider 28 and the merging performed by merger 30, and others are described with respect to the subsequent Figs. 12 to 16. In the absence of any obstacle, all of these deviations may be applied, individually or in subsets, to the above description of Fig. 1 and Fig. 2, respectively. For example, dividers 20 and 104 need not determine merely a prediction subdivision and a residual subdivision per image. Rather, they may also determine a filter subdivision for the optional in-loop filters 38 and 112, respectively, either independent of, or dependent on, the other subdivisions for prediction or residual coding, respectively. Further, a determination of the subdivision or subdivisions by these elements need not be performed on an image-by-image basis. Rather, a subdivision or subdivisions determined for a certain image may be reused or adopted for a certain number of following images, with merely transferring a new subdivision then.
In order to provide further details regarding the division of the images into sub-regions, the following description first focuses on the subdivision part for which subdividers 28 and 104a assume responsibility. Then, the merging process for which merger 30 and merger 104b assume responsibility is described. Finally, inter-plane adaptation/prediction is described.
The way in which subdividers 28 and 104a divide the images is such that an image is divided into a number of blocks of possibly different sizes for the purpose of predictive and residual coding of the image or video data. As mentioned before, an image 24 may be available as one or more arrays of image sample values. In the case of the YUV/YCbCr color space, for example, the first array may represent the luma channel while the other two arrays represent chroma channels. These arrays may have different dimensions. All arrays may be grouped into one or more so-called plane groups, with each plane group consisting of one or more consecutive planes, where each plane is contained in one and only one plane group. The following applies to each plane group. The first array of a particular plane group may be called the primary array of this plane group. Possibly following arrays are subordinate arrays. The division into blocks of the primary array may be done based on a quadtree approach, as described below. The division into blocks of the subordinate arrays may be derived based on the division of the primary array.
According to the embodiments described below, subdividers 28 and 104a are configured to divide the primary array into a number of equally-sized square blocks, so-called tree blocks in the following. The edge length of the tree blocks is typically a power of two, such as 16, 32 or 64, when quadtrees are used. However, it should be noted that the use of other tree types would be possible as well, such as binary trees or trees with any number of leaves. Moreover, the number of children of the tree may be varied depending on the level of the tree and depending on what signal the tree represents.
Besides this, as mentioned above, the sample arrays may also represent other information of a video sequence, such as depth maps or light fields, respectively. For simplicity, the following description focuses on quadtrees as a representative example for multi-trees. Quadtrees are trees that have exactly four children at each internal node. Each of the tree blocks constitutes a primary quadtree together with subordinate quadtrees at each of the leaves of the primary quadtree. The primary quadtree determines the subdivision of a given tree block for prediction, while a subordinate quadtree determines the subdivision of a given prediction block for the purpose of residual coding.
The root node of the primary quadtree corresponds to the full tree block. For example, Fig. 3a shows a tree block 150. It should be recalled that each image is divided into a regular grid of rows and columns of such tree blocks 150 so that they, for example, gaplessly cover the sample array. However, it should be noted that for all the block subdivisions shown subsequently, the seamless subdivision without overlap is not critical. Rather, neighboring blocks may overlap each other as long as no leaf block is a proper sub-portion of a neighboring leaf block.
Along the quadtree structure for a tree block 150, each node may be further divided into four child nodes, which, in the case of the primary quadtree, means that each tree block 150 may be split into four sub-blocks with half the width and half the height of tree block 150. In Fig. 3a, these sub-blocks are indicated with reference signs 152a to 152d. In the same way, each of these sub-blocks may again be further divided into four smaller sub-blocks of half the width and half the height of the original sub-blocks. In Fig. 3b this is shown exemplarily for sub-block 152c, which is subdivided into four small sub-blocks 154a to 154d. So far, Figs. 3a to 3c show how an exemplary tree block 150 is first divided into its four sub-blocks 152a to 152d, then the bottom-left sub-block 152c is further divided into four small sub-blocks 154a to 154d, and finally, as shown in Fig. 3c, the top-right block 154b of these smaller sub-blocks is once more divided into four blocks of one-eighth the width and height of the original tree block 150, these even smaller blocks being denoted 156a to 156d.
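The recursive splitting just walked through can be sketched in a few lines. The tree block edge length of 64 is merely one of the typical values named above, and the coordinate convention (top-left origin, `(x, y, size)` tuples, raster order of the four sub-blocks) is an assumption for illustration:

```python
# Geometry sketch of the recursive quadtree splitting of Figs. 3a-3c,
# assuming a tree block edge length of 64 samples.

def split(block):
    """Split a square block into its four half-sized sub-blocks in raster
    order: top-left, top-right, bottom-left, bottom-right."""
    x, y, size = block
    h = size // 2
    return [(x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)]


tree_block_150 = (0, 0, 64)
sub_152 = split(tree_block_150)   # 152a-d, each 32x32
sub_154 = split(sub_152[2])       # 152c (bottom-left) -> 154a-d, each 16x16
sub_156 = split(sub_154[1])       # 154b (top-right)   -> 156a-d, each 8x8

assert all(s == 32 for _, _, s in sub_152)
assert sub_156[0][2] == 64 // 8   # one eighth of the tree block edge length
```

Each further split halves the edge length, so hierarchy level n corresponds to blocks of edge length 64 / 2^n here.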
Fig. 4 shows the underlying tree structure for the exemplary quadtree-based division shown in Figs. 3a to 3c. The numbers beside the tree nodes are the values of the so-called subdivision flags, which will be explained in detail later when discussing the signaling of the quadtree structure. The root node of the quadtree is depicted at the top of the figure (labeled Level 0). The four branches at level 1 of this root node correspond to the four sub-blocks shown in Fig. 3a. Since the third of these sub-blocks is further subdivided into its four sub-blocks in Fig. 3b, the third node at level 1 in Fig. 4 also has four branches. Again, corresponding to the subdivision of the second (top-right) child node in Fig. 3c, there are four sub-branches connected with the second node at level 2 of the quadtree hierarchy. The nodes at level 3 are not subdivided any further.
Each leaf of the primary quadtree corresponds to a variable-sized block for which individual prediction parameters can be specified (i.e., intra or inter prediction mode, motion parameters, etc.). In the following, these blocks are called prediction blocks. In particular, these leaf blocks are the blocks shown in Fig. 3c. Briefly referring back to the description of Figs. 1 and 2, divider 20 or subdivider 28 determines the quadtree subdivision as just explained. Subdivider 28 performs the decision as to which of tree blocks 150, sub-blocks 152a-d, small sub-blocks 154a-d and so on, to subdivide or partition further, in order to find an optimal tradeoff between a too-fine prediction subdivision and a too-coarse prediction subdivision, as already indicated above. Predictor 12, in turn, uses the prescribed prediction subdivision in order to determine the aforementioned prediction parameters at a granularity depending on the prediction subdivision, i.e., for each of the prediction sub-regions represented by the blocks shown in Fig. 3c, for example.
The prediction blocks shown in Fig. 3c may be further divided into smaller blocks for the purpose of residual coding. For each prediction block, that is, for each leaf node of the primary quadtree, the corresponding subdivision is determined by one or more subordinate quadtree(s) for residual coding. For example, when allowing a maximum residual block size of 16x16, a given 32x32 prediction block would be divided into four 16x16 blocks, each of which is determined by a subordinate quadtree for residual coding. Each 16x16 block in this example corresponds to the root node of a subordinate quadtree.
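The relationship between a prediction block and its subordinate residual quadtree root nodes can be captured by a small helper; the function name is hypothetical and the restriction to square blocks of power-of-two size mirrors the examples in the text:

```python
def residual_root_nodes(pred_size, max_residual_size):
    """Number of subordinate residual-quadtree root nodes per square
    prediction block. If the prediction block is no larger than the maximum
    residual block size, the whole block is a single root node; otherwise it
    is first tiled into root nodes of the maximum residual size."""
    if pred_size <= max_residual_size:
        return 1
    per_side = pred_size // max_residual_size
    return per_side * per_side


# The examples from the text: a 32x32 prediction block is a single root node
# when the maximum residual block size is 64x64, but consists of four 16x16
# root nodes when the maximum residual block size is 16x16.
assert residual_root_nodes(32, 64) == 1
assert residual_root_nodes(32, 16) == 4
```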
Just as described for the case of subdividing a given tree block into prediction blocks, each prediction block may be divided into a number of residual blocks by use of subordinate quadtree decomposition(s). Each leaf of a subordinate quadtree corresponds to a residual block for which individual residual coding parameters (i.e., transform mode, transform coefficients, etc.) may be specified by residual precoder 14, which residual coding parameters control, in turn, residual reconstructors 16 and 106, respectively.
In other words, subdivider 28 may be configured to determine, for each image or for each group of images, a prediction subdivision and a subordinate residual subdivision by first dividing the image into a regular arrangement of tree blocks 150, recursively partitioning a subset of these tree blocks by quadtree subdivision in order to obtain the prediction subdivision into prediction blocks (which may be tree blocks, if no partitioning took place at the respective tree block, or leaf blocks of the quadtree subdivision), and then further subdividing a subset of these prediction blocks in a similar way, namely, if a prediction block is greater than the maximum size of the subordinate residual subdivision, by first dividing the respective prediction block into a regular arrangement of sub-tree blocks and then subdividing a subset of these sub-tree blocks in accordance with the quadtree subdivision procedure in order to obtain the residual blocks (which may be prediction blocks, if no division into sub-tree blocks took place at the respective prediction block; sub-tree blocks, if no division into even smaller regions took place at the respective sub-tree block; or leaf blocks of the residual quadtree subdivision).
As briefly outlined above, the subdivisions chosen for a primary array may be mapped onto the subordinate arrays. This is easy when considering subordinate arrays of the same dimension as the primary array. However, special measures have to be taken when the dimensions of the subordinate arrays differ from the dimension of the primary array. Generally speaking, the mapping of the primary array subdivision onto the subordinate arrays in case of different dimensions may be done by spatially mapping same, i.e., by spatially mapping the block borders of the primary array subdivision onto the subordinate arrays. In particular, for each subordinate array, there may be a scale factor in horizontal and vertical direction which determines the ratio of the dimension of the primary array to that of the subordinate array. The division of the subordinate array into sub-blocks for prediction and residual coding may be determined by the primary quadtree and the subordinate quadtree(s) of each of the co-located tree blocks of the primary array, respectively, with the resulting tree blocks of the subordinate array being scaled by the relative scale factors. In case the scale factors in horizontal and vertical direction differ (as, for example, in 4:2:2 chroma subsampling), the resulting prediction and residual blocks of the subordinate array would not be square anymore. In this case, it is possible to either predetermine or adaptively select (for the whole sequence, one image out of the sequence, or for each single prediction or residual block) whether the non-square residual block shall be split into square blocks. In the first case, for example, encoder and decoder could agree on a subdivision into square blocks each time a mapped block is not square. In the second case, subdivider 28 would signal the selection via data stream inserter 18 and data stream 22 to subdivider 104a. For example, in the case of 4:2:2 chroma subsampling, where the subordinate arrays have half the width but the same height as the primary array, the residual blocks would be twice as high as they are wide. By vertically splitting such a block, one would obtain two square blocks again.
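A minimal sketch of this mapping and square-splitting, assuming the 4:2:2 case named in the text (subordinate array of half width, same height). Note that halving only the width maps a square primary block to a block twice as high as it is wide, which a vertical split turns back into two squares. The function names are illustrative:

```python
from fractions import Fraction

def map_block(w, h, sx, sy):
    """Map a primary-array block to the subordinate array using the
    horizontal/vertical scale factors between the two arrays."""
    return int(w * sx), int(h * sy)


def split_into_squares(w, h):
    """Split a non-square block into square blocks along its longer side."""
    if w == h:
        return [(w, h)]
    side = min(w, h)
    return [(side, side)] * (max(w, h) // side)


# 4:2:2 chroma subsampling: half the width, same height, so a 16x16 primary
# block maps to an 8x16 block in the subordinate array.
w, h = map_block(16, 16, Fraction(1, 2), Fraction(1, 1))
assert (w, h) == (8, 16)
# Splitting the 8x16 block vertically yields two square 8x8 blocks again.
assert split_into_squares(w, h) == [(8, 8), (8, 8)]
```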
As mentioned above, subdivider 28 or divider 20, respectively, signals the quadtree-based division via data stream 22 to subdivider 104a. To this end, subdivider 28 informs data stream inserter 18 about the subdivisions chosen for the images 24. The data stream inserter, in turn, transmits the structure of the primary and secondary quadtrees and, therefore, the division of the image array into variable-sized blocks for prediction or residual coding within the data stream or bitstream 22, respectively, to the decoding side.
The minimum and maximum admissible block sizes are transmitted as side information and may change from image to image. Alternatively, the minimum and maximum admissible block sizes may be fixed in encoder and decoder. These minimum and maximum block sizes may be different for prediction blocks and residual blocks. For the signaling of the quadtree structure, the quadtree has to be traversed, and for each node it has to be specified whether this particular node is a leaf node of the quadtree (i.e., the corresponding block is not subdivided any further) or whether it branches into its four child nodes (i.e., the corresponding block is divided into four sub-blocks of half the size).
The signaling within one image is done tree block by tree block in a raster scan order, such as from left to right and top to bottom, as illustrated at 140 in Fig. 5a. This scan order could also be different, such as from bottom right to top left, or in a checkerboard sense. In a preferred embodiment, each tree block and, therefore, each quadtree is traversed in depth-first order for signaling the subdivision information.
In a preferred embodiment, not only the subdivision information, i.e., the structure of the tree, but also the prediction data, etc., i.e., the payload associated with the leaf nodes of the tree, are transmitted/processed in depth-first order. This is done because depth-first traversal has major advantages over breadth-first order. In Fig. 5b, a quadtree structure is presented with the leaf nodes labeled a, b, ..., j. Fig. 5a shows the resulting block division. If the blocks/leaf nodes are traversed in breadth-first order, we obtain the following order: abjchidefg. In depth-first order, however, the order is abc...ij. As can be seen from Fig. 5a, in depth-first order, the left neighboring block and the top neighboring block are always transmitted/processed prior to the current block. Thus, motion vector prediction and context modeling can always use the parameters specified for the left and top neighboring blocks in order to achieve an improved coding performance. For breadth-first order, this would not be the case, since block j is transmitted prior to blocks e, g and i, for example.
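The two traversal orders can be reproduced for the tree of Figs. 5a/5b, here rebuilt as nested lists; the nesting is inferred from the two leaf orders given in the text and is otherwise an assumption:

```python
from collections import deque

# The quadtree of Figs. 5a/5b as nested lists: strings are leaves a..j,
# inner nodes are lists of their four children in raster order.
fig5_tree = ["a", "b", ["c", ["d", "e", "f", "g"], "h", "i"], "j"]


def leaves_depth_first(node):
    # Recurse into each child before moving to the next sibling.
    if isinstance(node, str):
        return [node]
    out = []
    for child in node:
        out += leaves_depth_first(child)
    return out


def leaves_breadth_first(root):
    # Visit all nodes of one hierarchy level before descending further.
    out, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            out.append(node)
        else:
            queue.extend(node)
    return out


assert "".join(leaves_depth_first(fig5_tree)) == "abcdefghij"
assert "".join(leaves_breadth_first(fig5_tree)) == "abjchidefg"
```

In the depth-first output, every leaf's left and top neighbors precede it, which is exactly the property exploited for motion vector prediction and context modeling.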
Therefore, the signaling for each tree block is done recursively along the quadtree structure of the primary quadtree such that for each node a flag is transmitted, specifying whether the corresponding block is split into four sub-blocks. If this flag has the value 1 (for "true"), then this signaling process is repeated recursively for all four child nodes, i.e., the sub-blocks, in raster scan order (top left, top right, bottom left, bottom right), until the leaf node of the primary quadtree is reached. Note that a leaf node is characterized by having a subdivision flag with the value 0. For the case where a node resides at the lowest hierarchy level of the primary quadtree and, therefore, corresponds to the smallest admissible prediction block size, no subdivision flag has to be transmitted. For the example of Figs. 3a-c, one would first transmit a 1, as shown at 190 in Fig. 6a, specifying that tree block 150 is split into its four sub-blocks 152a-d. Then, one would recursively encode the subdivision information of all four sub-blocks 152a-d in raster scan order 200. For the first two sub-blocks 152a, 152b, one would transmit a 0, indicating that they are not subdivided (see 202 in Fig. 6a). For the third sub-block 152c (bottom left), one would transmit a 1, indicating that this block is subdivided (see 204 in Fig. 6a). Now, following the recursive approach, the four sub-blocks 154a-d of this block would be processed. Here, one would transmit a 0 for the first sub-block (206) and a 1 for the second (top-right) sub-block (208). Now, the four blocks of smallest block size 156a-d in Fig. 3c would be processed. Since the smallest admissible block size in this example has already been reached, no more data would have to be transmitted, since a further subdivision is not possible. Otherwise, 0000, indicating that none of these blocks is divided any further, would be transmitted, as indicated at 210 in Fig. 6a.
After that, 00 would be transmitted for the two lower blocks in Fig. 3b (see 212 in Fig. 6a), and finally 0 for the bottom-right block in Fig. 3a (see 214). So the complete binary string representing the quadtree structure would be the one illustrated in Fig. 6a.
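The flag sequence of the running example can be reproduced by a small depth-first encoder; the tree representation (`None` for an unsplit block, a list for a split one) is an illustrative choice, and the maximum hierarchy level of 3 reflects the example above:

```python
MAX_LEVEL = 3  # the smallest admissible block size is reached at level 3


def encode_subdivision(node, level=0):
    """Emit one subdivision flag per node in depth-first, raster-scan order.
    Nodes at the maximum hierarchy level send no flag, since a further split
    is impossible there."""
    if level == MAX_LEVEL:
        return ""
    if node is None:          # leaf: block is not split any further
        return "0"
    bits = "1"                # block is split into its four sub-blocks
    for child in node:
        bits += encode_subdivision(child, level + 1)
    return bits


# Tree block 150 of Figs. 3a-3c: 152c (third child) is split; within it,
# 154b (second child) is split into the smallest-size blocks 156a-d.
fig3_tree = [None, None, [None, [None, None, None, None], None, None], None]
assert encode_subdivision(fig3_tree) == "100101000"
```

The nine emitted bits correspond, in order, to the portions 190, 202, 204, 206, 208, 212 and 214 of the string in Fig. 6a; the four level-3 blocks 156a-d (210) contribute no bits because the smallest block size is reached.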
The different background shadings in this binary string representation of Fig. 6a correspond to different levels in the hierarchy of the quadtree-based subdivision. Shading 216 represents level 0 (corresponding to a block size equal to the original tree block size), shading 218 represents level 1 (corresponding to a block size equal to half the original tree block size), shading 220 represents level 2 (corresponding to a block size equal to a quarter of the original tree block size), and shading 222 represents level 3 (corresponding to a block size equal to one eighth of the original tree block size). All subdivision flags of the same hierarchy level (corresponding to the same block size and the same color in the exemplary binary string representation) may be entropy-coded by inserter 18 using, for example, one and the same probability model.
Note that, in the case of a breadth-first traversal, the subdivision information would be transmitted in a different order, shown in Fig. 6b.
Similar to the subdivision of each tree block for the purpose of prediction, the division of each resulting prediction block into residual blocks has to be transmitted in the bitstream. Also, there may be a maximum and minimum block size for residual coding, transmitted as side information, which may change from image to image. Alternatively, the maximum and minimum block size for residual coding may be fixed in encoder and decoder. At each leaf node of the primary quadtree, as shown in Fig. 3c, the corresponding prediction block may be divided into residual blocks of the maximally admissible size. These blocks are the constituent root nodes of the subordinate quadtree structure for residual coding. For example, if the maximum residual block size for the image is 64x64 and the prediction block is of size 32x32, then the whole prediction block would correspond to one subordinate (residual) quadtree root node of size 32x32. On the other hand, if the maximum residual block size for the image is 16x16, then the 32x32 prediction block would consist of four residual quadtree root nodes, each of size 16x16. Within each prediction block, the signaling of the subordinate quadtree structure is done root node by root node in raster scan order (left to right, top to bottom). As in the case of the primary (prediction) quadtree structure, for each node a flag is coded, specifying whether this particular node is split into its four child nodes. Then, if this flag has the value 1, this procedure is repeated recursively for all four corresponding child nodes and their corresponding sub-blocks in raster scan order (top left, top right, bottom left, bottom right) until a leaf node of the subordinate quadtree is reached. As in the case of the primary quadtree, no signaling is required for nodes at the lowest hierarchy level of the subordinate quadtree, since those nodes correspond to residual blocks of the smallest possible size, which cannot be split any further.
For entropy coding, residual block subdivision flags belonging to residual blocks of the same block size may be encoded using one and the same probability model.
Thus, in accordance with the example presented above with respect to Figs. 3a to 6a, subdivider 28 defined a primary subdivision for prediction purposes and a subordinate subdivision, of blocks of possibly different sizes, of the primary subdivision for residual coding purposes. Data stream inserter 18 coded the primary subdivision by signaling, for each tree block in a zigzag scan order, a bit sequence built in accordance with Fig. 6a, along with coding the maximum primary block size and the maximum hierarchy level of the primary subdivision. For each prediction block thus defined, the associated prediction parameters have been included into the data stream. Additionally, a coding of similar information, i.e., maximum size, maximum hierarchy level and bit sequences in accordance with Fig. 6a, took place for each prediction block the size of which was equal to or smaller than the maximum size for the residual subdivision, and for each residual tree root block into which prediction blocks exceeding the maximum size defined for residual blocks have been pre-divided. For each residual block thus defined, residual data is inserted into the data stream.
Extractor 102 extracts the respective bit sequences from the data stream at input 116 and informs divider 104 about the subdivision information thus obtained. Besides this, data stream inserter 18 and extractor 102 may use the aforementioned order among the prediction blocks and residual blocks to transmit further syntax elements, such as the residual data output by residual precoder 14 and the prediction parameters output by predictor 12. Using this order has advantages in that adequate contexts for encoding the individual syntax elements for a certain block may be chosen by exploiting already encoded/decoded syntax elements of neighboring blocks. Moreover, similarly, residual precoder 14 and predictor 12 as well as residual reconstructor 106 and predictor 110 may process the individual prediction and residual blocks in the order outlined above.
Fig. 7 shows a flow diagram of steps which may be performed by extractor 102 in order to extract the subdivision information from the data stream 22 when coded in the manner described above. In a first step, extractor 102 divides the image 24 into tree root blocks 150. This step is indicated as step 300 in fig. 7. Step 300 may involve extractor 102 extracting the maximum prediction block size from data stream 22. Additionally or alternatively, step 300 may involve extractor 102 extracting the maximum hierarchy level from data stream 22.
Then, in a step 302, extractor 102 decodes a flag or bit from the data stream. The first time step 302 is performed, extractor 102 knows that the respective flag is the first flag of the bit sequence belonging to the first tree root block 150 in the tree root block scan order 140. As this flag is a flag of hierarchy level 0, extractor 102 may use context modeling associated with hierarchy level 0 in step 302 in order to determine a context. Each context may have a respective probability estimate for entropy-decoding the flag associated with it, and the probability estimate of an individual context may be adapted to the respective context symbol statistics. For example, in order to determine an appropriate context for decoding the hierarchy level 0 flag in step 302, extractor 102 may select one context of a set of contexts associated with that hierarchy level 0, depending on the hierarchy level 0 flags of neighboring tree blocks or, even further, depending on information contained within the bit sequences defining the quadtree subdivision of neighboring tree blocks of the currently processed tree block, such as the top and left neighboring tree blocks.
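Merely as an illustration of the context selection just described, the following sketch assigns a context index per hierarchy level, refined by the split decisions of the top and left neighboring blocks. The concrete indexing scheme (three contexts per level) is a hypothetical assumption made for the sketch, not the exact mapping of the embodiment:

```python
def split_flag_context(hierarchy_level, top_was_split, left_was_split):
    """Hypothetical context index for a split flag: one context set per
    hierarchy level, the index within the set being chosen from the split
    decisions of the top and left neighboring blocks."""
    ctx_within_level = int(bool(top_was_split)) + int(bool(left_was_split))
    return 3 * hierarchy_level + ctx_within_level  # 3 contexts per level
```

The entropy decoder would keep one adaptive probability estimate per returned index, so flags of different hierarchy levels never share statistics.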
In the next step, namely step 304, extractor 102 checks whether the recently decoded flag suggests a partitioning. If this is the case, extractor 102 partitions the current block, at present a tree block, or indicates this partitioning to subdivisor 104a in step 306, and checks, in step 308, whether the current hierarchy level equals the maximum hierarchy level minus one. For example, extractor 102 could also have the maximum hierarchy level extracted from the data stream in step 300. If the current hierarchy level is unequal to the maximum hierarchy level minus one, extractor 102 increases the current hierarchy level by 1 in step 310 and steps back to step 302 to decode the next flag from the data stream. This time, the flag to be decoded in step 302 belongs to another hierarchy level and, therefore, according to an embodiment, extractor 102 may select one of different sets of contexts, namely the set belonging to the current hierarchy level. The selection may also be based on subdivision bit sequences, according to fig. 6a, of neighboring tree blocks having already been decoded.
If a flag is decoded, and the check in step 304 reveals that this flag does not suggest a partitioning of the current block, extractor 102 proceeds with step 312 to check whether the current hierarchy level is 0. If this is the case, extractor 102 proceeds processing with respect to the next tree root block in the scan order 140 in step 314 or, if there is no tree root block left to be processed, stops processing the extraction of the subdivision information.
It should be noted that the description of fig. 7 concentrates on the decoding of the subdivision indication flags of the prediction subdivision only, so that, in fact, step 314 may involve the decoding of further bins or syntax elements relating, for example, to the current tree block. In any case, if a further or next tree root block exists, extractor 102 proceeds from step 314 to step 302 to decode the next flag from the subdivision information, namely, the first flag of the flag sequence relating to the new tree root block.
If, in step 312, the hierarchy level turns out to be unequal to 0, operation proceeds in step 316 with a check as to whether further child nodes pertaining to the current parent node exist. That is, when extractor 102 performs the check in step 316, it has already been checked in step 312 that the current hierarchy level is a hierarchy level other than hierarchy level 0. This, in turn, means that a parent node exists, which belongs to a tree root block 150 or one of the smaller blocks 152a-d, or even smaller blocks, and so on. The node of the tree structure to which the recently decoded flag belongs has a parent node which is common to three further nodes of the current tree structure. The scan order among such child nodes having a common parent node has been illustrated exemplarily in fig. 3 for hierarchy level 0 with reference sign 200. Thus, in step 316, extractor 102 checks whether all of these four child nodes have already been visited within the process of fig. 7. If this is not the case, that is, if there are further child nodes of the current parent node, the process of fig. 7 proceeds with step 318, where the next child node in accordance with the zigzag scan order 200 within the current hierarchy level is visited, so that its corresponding sub-block now represents the current block of the process of fig. 7 and, thereafter, a flag is decoded in step 302 from the data stream with respect to the current block or current node. If, however, there are no further child nodes for the current parent node in step 316, the process of fig. 7 proceeds to step 320, where the current hierarchy level is decreased by 1, whereupon the process proceeds with step 312.
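The flag-driven traversal of steps 302 to 320 can equivalently be expressed as a recursive descent over the already entropy-decoded split flags. The sketch below is a simplified illustration under the assumption that the flags are available as a plain list; all names are hypothetical:

```python
def decode_subdivision(flags, x, y, size, level, max_level, leaves):
    """Consume split flags in depth-first order and collect the resulting
    leaf blocks as (x, y, size) tuples. No flag is transmitted for blocks
    at the maximum hierarchy level; they are leaves by definition."""
    if level == max_level:
        leaves.append((x, y, size))
        return
    if not flags.pop(0):  # flag 0: the block is not partitioned any further
        leaves.append((x, y, size))
        return
    half = size // 2
    # Zigzag scan order 200 over the four sub-blocks: top-left, top-right,
    # bottom-left, bottom-right; each is fully subdivided before the next.
    for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
        decode_subdivision(flags, x + dx, y + dy, half, level + 1,
                           max_level, leaves)
```

For a 16 x 16 tree block with maximum hierarchy level 2, the flag sequence 1, 0, 1, 0, 0 splits the root, leaves the first child whole, quarters the second child and leaves the remaining two children whole, yielding seven leaf blocks.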
By performing the steps shown in fig. 7, extractor 102 and subdivisor 104a cooperate to retrieve from the data stream the subdivision chosen at the encoder side. The process of fig. 7 concentrates on the case of the prediction subdivision described above. Fig. 8 shows, in combination with the flow diagram of fig. 7, how extractor 102 and subdivisor 104a cooperate to retrieve the residual subdivision from the data stream.
In particular, fig. 8 shows the steps performed by extractor 102 and subdivisor 104a, respectively, for each prediction block resulting from the prediction subdivision. These prediction blocks are traversed, as mentioned above, according to the zigzag scan order 140 among the tree blocks 150 of the prediction subdivision, and using a depth-first traversal order within each currently visited tree block for passing through the leaf blocks, as shown, for example, in fig. 3c. According to the depth-first traversal order, the leaf blocks of partitioned primary tree blocks are visited in depth-first order, with the sub-blocks of a certain hierarchy level having a common parent node being visited in the zigzag scan order 200, and with the subdivision of each of these sub-blocks being scanned primarily before proceeding to the next sub-block in this zigzag scan order 200.
For the example of fig. 3c, the resulting scan order among the leaf nodes of tree block 150 is shown with reference sign 350.
For a currently visited prediction block, the process of fig. 8 starts at step 400. In step 400, an internal parameter denoting the current size of the current block is set equal to the size of hierarchy level 0 of the residual subdivision, that is, the maximum block size of the residual subdivision. It should be recalled that the maximum residual block size may be lower than the smallest prediction block size of the prediction subdivision or may be equal to or greater than the latter. In other words, according to an embodiment, the encoder is free to choose any of the possibilities just mentioned.
In the next step, namely step 402, a check is performed as to whether the prediction block size of the currently visited prediction block is greater than the internal parameter denoting the current size. If this is the case, the currently visited prediction block, which may be a leaf block of the prediction subdivision or a tree block of the prediction subdivision which has not been partitioned any further, is greater than the maximum residual block size, and in this case the process of fig. 8 proceeds with step 300 of fig. 7. That is, the currently visited prediction block is divided into residual tree root blocks, and the first flag of the flag sequence of the first residual tree block within this currently visited prediction block is decoded in step 302, and so forth.
If, however, the currently visited prediction block has a size equal to or smaller than the internal parameter denoting the current size, the process of fig. 8 proceeds to step 404, where the prediction block size is checked as to whether it equals the internal parameter denoting the current size. If this is the case, the dividing step 300 may be skipped and the process proceeds directly with step 302 of fig. 7.
If, however, the prediction block size of the currently visited prediction block is smaller than the internal parameter denoting the current size, the process of fig. 8 proceeds with step 406, where the hierarchy level is increased by 1 and the current size is set to the size of the new hierarchy level, that is, divided by 2 (in both axis directions, in the case of quadtree subdivision). Thereafter, the check of step 404 is performed again. The effect of the loop formed by steps 404 and 406 is that the hierarchy level always corresponds to the size of the corresponding blocks to be partitioned, independent of whether the respective prediction block was smaller than, equal to, or greater than the maximum residual block size. Thus, when decoding the flags in step 302, the context modeling performed depends on the hierarchy level and the block size to which the flag refers, at the same time. Using different contexts for flags of different hierarchy levels or block sizes, respectively, is advantageous in that the probability estimate may fit well the actual probability distribution among the flag value occurrences while, on the other hand, entailing a relatively moderate number of contexts to be managed, thereby reducing the context management overhead as well as increasing the adaptation of the contexts to the actual symbol statistics.
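The size and level bookkeeping of steps 400 to 406 can be sketched as follows, assuming power-of-two block sizes. The helper returns whether the prediction block must first be split into residual tree root blocks (step 402 leading to step 300) and, otherwise, the hierarchy level and current size at which flag decoding starts; all names are hypothetical:

```python
def residual_start_state(pred_block_size, max_residual_size):
    """Mirror steps 400-406 of fig. 8: compare the prediction block size
    with the maximum residual block size and derive the starting hierarchy
    level and current size for the residual subdivision decoding."""
    if pred_block_size > max_residual_size:
        # Step 402 -> step 300: split into residual tree root blocks first.
        return True, 0, max_residual_size
    level, size = 0, max_residual_size
    # Loop of steps 404/406: halve the current size and increase the level
    # until the level's size matches the prediction block size.
    while size > pred_block_size:
        level += 1
        size //= 2
    return False, level, size
```

A 64-sample prediction block with a 32-sample maximum residual size is split into root blocks first, while an 8-sample prediction block starts flag decoding directly at hierarchy level 2.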
As noted above, there may be more than one sample array, and these sample arrays may be grouped into one or more plane groups. The input signal to be encoded, entering input 32, may, for example, be an image of a video sequence or a still image. The image may thus be given in the form of one or more sample arrays. In the context of encoding an image of a video sequence or a still image, the sample arrays may refer to three color planes, such as red, green and blue, or to luminance and chrominance planes, as in YUV or YCbCr color representations. Additionally, sample arrays representing alpha, that is, transparency, and/or depth information for 3-D video material may be present as well. A number of these sample arrays may be grouped together as a so-called plane group. For example, luminance (Y) may be one plane group with only one sample array and chrominance, such as CbCr, may be another plane group with two sample arrays or, in another example, YUV may be one plane group with three sample arrays and the depth information for 3-D video material may be a different plane group with a single sample array. For each plane group, one primary quadtree structure may be coded within the data stream 22 to represent the division into prediction blocks and, for each prediction block, a secondary quadtree structure representing the division into residual blocks. Thus, according to the first example just mentioned, where the luminance component is one plane group while the chrominance component forms the other plane group, there would be one quadtree structure for the luminance plane prediction blocks, one quadtree structure for the luminance plane residual blocks, one quadtree structure for the chrominance plane prediction blocks and one quadtree structure for the chrominance plane residual blocks. In the second example mentioned before, however, there would be one quadtree structure for the luminance and chrominance prediction blocks together (YUV), one quadtree structure for the luminance and chrominance residual blocks together (YUV), one quadtree structure for the prediction blocks of the depth information for 3-D video material and one quadtree structure for the residual blocks of the depth information for 3-D video material.
Further, in the preceding description, the input signal was divided into prediction blocks using a primary quadtree structure, and it was described how these prediction blocks were further subdivided into residual blocks using a subordinate quadtree structure. According to an alternative embodiment, the subdivision need not end at the subordinate quadtree stage. That is, the blocks obtained from a division using the subordinate quadtree structure may be further subdivided using a tertiary quadtree structure. This division, in turn, may be used for the purpose of employing further coding tools that may facilitate encoding the residual signal.
The preceding description concentrated on the subdivision performed by subdivisor 28 and subdivisor 104a, respectively. As mentioned above, the subdivision defined by subdivisors 28 and 104a, respectively, may control the processing granularity of the aforementioned modules of encoder 10 and decoder 100. However, according to the embodiments described in the following, subdividers 28 and 104a, respectively, are followed by a merger 30 and 104b, respectively. It should be noted, however, that mergers 30 and 104b are optional and may be left away.
In effect, however, and as will be described in more detail below, the merger provides the encoder with the opportunity to combine some of the prediction blocks or residual blocks into groups or clusters, so that the other, or at least some of the other, modules may treat these groups of blocks together. For example, predictor 12 may sacrifice the small deviations between the prediction parameters of some prediction blocks as determined by optimization using the subdivision of subdivisor 28, and use prediction parameters common to all of these prediction blocks instead, if the signaling of the grouping of the prediction blocks along with the transmission of a common parameter for all blocks belonging to this group is more promising in rate/distortion sense than individually signaling the prediction parameters for all of these prediction blocks. The processing of retrieving the prediction in predictors 12 and 110 itself, based on these common prediction parameters, may, however, still take place block-wise. However, it is also possible that predictors 12 and 110 even perform the prediction process once for the whole group of prediction blocks.
As will be described in more detail below, it is also possible that the grouping of prediction blocks is not only for using the same or common prediction parameters for a group of prediction blocks but, alternatively or additionally, enables encoder 10 to send one prediction parameter for this group along with prediction residuals for the prediction blocks belonging to this group, so that the signaling overhead for signaling the prediction parameters for this group may be reduced. In the latter case, the merging process may merely influence data stream inserter 18 rather than the decisions made by residual precoder 14 and predictor 12. More details are presented below. For completeness, however, it should be noted that the aspect just mentioned also applies to the other subdivisions, such as the residual subdivision or the filter subdivision mentioned above.
First, the merging of sets of samples, such as the aforementioned prediction and residual blocks, is motivated in a more general sense, that is, not restricted to the aforementioned multi-tree subdivision. Subsequently, however, the description concentrates on the merging of blocks resulting from multi-tree subdivision, for which embodiments have just been described above.
Generally speaking, merging the syntax elements associated with particular sets of samples for the purpose of transmitting associated coding parameters enables reducing the side information rate in image and video coding applications. For example, the sample arrays of the signal to be encoded are usually partitioned into particular sets of samples, or sample sets, which may represent rectangular or square blocks, or any other collection of samples, including arbitrarily shaped regions, triangles, or other shapes. In the embodiments described above, the simply connected regions were the prediction blocks and the residual blocks resulting from the multi-tree subdivision. The subdivision of the sample arrays may be fixed by the syntax or, as described above, the subdivision may be, at least partially, signaled inside the bit stream. To keep the side information rate for signaling the subdivision information small, the syntax usually only allows a limited number of choices, resulting in simple partitioning, such as the subdivision of blocks into smaller blocks. The sample sets are associated with particular coding parameters, which may specify prediction information or residual coding modes, etc. Details on this matter have been described above. For each sample set, individual coding parameters, such as for specifying the prediction and/or the residual coding, may be transmitted. In order to achieve improved coding efficiency, the merging aspect described in the following, that is, the merging of two or more sample sets into so-called groups of sample sets, enables some advantages, which are described further below. For example, sample sets may be merged such that all sample sets of such a group share the same coding parameters, which can be transmitted together with one of the sample sets of the group. By doing so, the coding parameters do not have to be transmitted for each sample set of the group of sample sets individually; rather, the coding parameters are transmitted only once for the whole group of sample sets. As a result, the side information rate for transmitting the coding parameters may be reduced and the overall coding efficiency may be improved. As an alternative approach, an additional refinement of one or more of the coding parameters may be transmitted for one or more of the sample sets of a group of sample sets. The refinement may be either applied to all sample sets of a group or only to the sample set for which it is transmitted.
The merging aspect described further below also provides the encoder with greater freedom in creating the bit stream 22, since the merging approach significantly increases the number of possibilities for selecting a partitioning for the sample arrays of an image. Since the encoder can choose between more options, such as the one minimizing a particular rate/distortion measure, the coding efficiency may be improved. There are several possibilities of operating an encoder. In a simple approach, the encoder could first determine the best subdivision of the sample arrays. Briefly referring to fig. 1, subdivisor 28 could determine the optimal subdivision in a first stage. Afterwards, it could be checked, for each sample set, whether a merging with another sample set or another group of sample sets reduces a particular rate/distortion cost measure. At this, the prediction parameters associated with a merged group of sample sets may be re-estimated, such as by performing a new motion search, or the prediction parameters that have already been determined for the current sample set and the candidate sample set or group of sample sets for the merging may be evaluated for the considered group of sample sets. In a more extensive approach, a particular rate/distortion cost measure may be evaluated for additional candidate groups of sample sets.
It should be noted that the merging approach described in the following does not change the processing order of the sample sets. That is, the merging concept may be implemented in a way that the delay is not increased, that is, each sample set remains decodable at the same time instant as without using the merging approach.
If, for example, the bit rate saved by reducing the number of coded prediction parameters is larger than the bit rate to be additionally spent for coding the merging information for indicating the merging to the decoding side, the merging approach described further below results in increased coding efficiency. It should further be mentioned that the described syntax extension for the merging provides the encoder with additional freedom in selecting the partitioning of an image or a plane group into blocks.
In other words, the encoder is not restricted to performing the subdivision first and then checking whether some of the resulting blocks have the same set or a similar set of prediction parameters. As one simple alternative, the encoder could first determine the subdivision in accordance with a rate/distortion cost measure, and then the encoder could check, for each block, whether a merging with one of its neighboring blocks or the associated, already determined group of blocks reduces a rate/distortion cost measure. At this, the prediction parameters associated with the new group of blocks may be re-estimated, such as by performing a new motion search, or the prediction parameters that have already been determined for the current block and the neighboring block or group of blocks may be evaluated for the new group of blocks. The merging information may be signaled on a block basis. Effectively, the merging may also be interpreted as an inference of the prediction parameters for a current block, wherein the inferred prediction parameters are set equal to the prediction parameters of one of the neighboring blocks. Alternatively, residuals may be transmitted for blocks within a group of blocks.
Thus, the basic idea underlying the merging concept described below is to reduce the bit rate required for transmitting the prediction parameters or other coding parameters by merging neighboring blocks into a group of blocks, where each group of blocks is associated with a unique set of coding parameters, such as prediction parameters or residual coding parameters. The merging information is signaled inside the bit stream in addition to the subdivision information, if present. The advantage of the merging concept is an increased coding efficiency resulting from a decreased side information rate for the coding parameters. It should be noted that the merging processes described here may also extend to dimensions other than the spatial dimensions. For example, a group of sets of samples or blocks, respectively, lying within several different video images, may be merged into one group of blocks. Merging may also be applied to 4-D compression and light-field coding.
Thus, briefly returning to the preceding description of figs. 1 to 8, it is noted that the merging process subsequent to the subdivision is advantageous independent of the specific way in which subdividers 28 and 104a, respectively, subdivide the images. To be more precise, the latter may also subdivide the images in a manner similar to, for example, H.264, that is, by subdividing each image into a regular arrangement of rectangular or square macroblocks of a predetermined size, such as 16 x 16 luminance samples, or a size signaled within the data stream, each macroblock having certain coding parameters associated therewith comprising, inter alia, partitioning parameters defining, for each macroblock, a partitioning into a regular sub-grid of 1, 2, 4 or some other number of partitions serving as the prediction granularity, with the corresponding prediction parameters in the data stream, as well as defining the partitioning for the residual and the corresponding residual transform granularity.
In any case, merging provides the advantages briefly discussed above, such as reducing the side information bit rate in image and video coding applications. Particular sets of samples, which may represent rectangular or square blocks or arbitrarily shaped regions, or any other collections of samples, such as any simply connected region of samples, are usually connected with a particular set of coding parameters and, for each of the sample sets, the coding parameters are included in the bit stream, the coding parameters representing, for example, prediction parameters which specify how the corresponding sample set is predicted using already coded samples. The partitioning of the sample arrays of an image into sample sets may be fixed by the syntax or may be signaled by respective subdivision information inside the bit stream. The coding parameters for the sample sets may be transmitted in a predefined order, which is given by the syntax. According to the merging functionality, merger 30 is able to signal, for a current sample set or a current block, such as a prediction block or a residual block, that it is merged with one or more other sample sets into a group of sample sets. The coding parameters for a group of sample sets, therefore, need to be transmitted only once. In a particular embodiment, the coding parameters of a current sample set are not transmitted if the current sample set is merged with a sample set or an already existing group of sample sets for which the coding parameters have previously been transmitted. Instead, the coding parameters for the current sample set are set equal to the coding parameters of the sample set or group of sample sets with which the current sample set is merged. As an alternative approach, an additional refinement of one or more of the coding parameters may be transmitted for a current sample set. The refinement may be either applied to all sample sets of a group or only to the sample set for which it is transmitted.
According to an embodiment, for each sample set, such as a prediction block as mentioned above, a residual block as mentioned above, or a leaf block of a multi-tree subdivision as mentioned above, the set of all previously coded/decoded sample sets is called the set of causal sample sets. See, for example, fig. 3c.
All blocks shown in this figure are the result of a certain subdivision, such as a prediction subdivision or a residual subdivision or any multi-tree subdivision, or the like, and the coding/decoding order defined among these blocks is defined by arrow 350. Considering a certain block among these blocks as being the current sample set or current simply connected region, its set of causal sample sets is made up of all the blocks preceding the current block along the order 350. However, it should again be recalled that another subdivision not using multi-tree subdivision would be possible as well, as far as the following discussion of the merging principles is concerned.
The sample sets that may be used for the merging with a current sample set are called the set of candidate sample sets in the following, and it is always a subset of the set of causal sample sets. The way in which the subset is formed may either be known to the decoder, or it may be specified inside the data stream or bit stream from the encoder to the decoder. If a certain current sample set is coded/decoded and its set of candidate sample sets is not empty, it is signaled within the data stream at the encoder, or derived from the data stream at the decoder, whether the current sample set is merged with one sample set out of this set of candidate sample sets and, if so, with which of them. Otherwise, the merging cannot be used for this block, since the set of candidate sample sets is empty anyway.
There are different ways of determining the subset of the set of causal sample sets which shall form the set of candidate sample sets. For example, the determination of the candidate sample sets may be based on a sample inside the current sample set which is uniquely geometrically defined, such as the upper-left image sample of a rectangular or square block. Starting from this uniquely geometrically defined sample, a particular non-zero number of samples is determined, which directly represent spatial neighbors of this uniquely geometrically defined sample. For example, the particular non-zero number of samples comprises the top neighbor and the left neighbor of the uniquely geometrically defined sample of the current sample set, so that the non-zero number of neighboring samples may be, at a maximum, two; one, if one of the top and left neighbors is not available or lies outside of the image; or zero, in case of both neighbors lacking.
The set of candidate sample sets may then be determined to encompass those sample sets that contain at least one of the just-mentioned non-zero number of neighboring samples. See, for example, fig. 9a. The sample set currently under consideration as being the object of the merging shall be block X, and its uniquely geometrically defined sample shall exemplarily be the top-left sample indicated at 400. The top and left neighboring samples of sample 400 are indicated at 402 and 404. The set of causal sample sets, or set of causal blocks, is highlighted in a shaded manner. Among these blocks, blocks A and B comprise one of the neighboring samples 402 and 404, respectively, and, therefore, these blocks form the set of candidate blocks or the set of candidate sample sets.
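For the rectangular-block situation of fig. 9a, this derivation can be sketched as follows. The helper `block_map`, mapping each sample position to the identifier of the block covering it, is a hypothetical stand-in for the decoder's block bookkeeping:

```python
def candidate_blocks(block_map, x, y):
    """Derive the merge candidates for the block whose uniquely geometrically
    defined (top-left) sample sits at (x, y): the blocks containing the top
    neighboring sample and the left neighboring sample, if inside the image."""
    candidates = []
    for nx, ny in ((x, y - 1), (x - 1, y)):  # top neighbor, left neighbor
        block_id = block_map.get((nx, ny))   # None if outside the image
        if block_id is not None and block_id not in candidates:
            candidates.append(block_id)
    return candidates
```

For a block X with its top-left sample at (4, 4), a map placing block A above and block B to the left yields the candidate set {A, B}; at the image corner (0, 0) the set is empty, and the candidate count drops to one when both neighboring samples belong to the same block.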
According to another embodiment, the set of candidate sample sets determined for the sake of the merging may additionally or exclusively include sample sets that contain a particular non-zero number of samples, which may be one or two, having the same spatial location but being contained in a different image, namely, for example, a previously coded/decoded image. For example, besides blocks A and B in fig. 9a, a block of a previously coded image could be used, which comprises the sample at the same position as sample 400. By the way, it is noted that merely the top neighboring sample 402, or merely the left neighboring sample 404, could be used to define the aforementioned non-zero number of neighboring samples. Generally, the set of candidate sample sets may be derived from previously processed data within the current image or in other images. The derivation may include spatial directional information, such as transform coefficients associated with a particular direction and image gradients of the current image, or it may include temporal directional information, such as neighboring motion representations. From such data available at the receiver/decoder and from other data and side information within the data stream, if present, the set of candidate sample sets may be derived.
It should be noted that the derivation of the candidate sample sets is performed in parallel by both merger 30 at the encoder side and merger 104b at the decoder side. As just mentioned, both may either determine the set of candidate sample sets independently from each other in a predefined way known to both, or the encoder may signal hints within the bit stream which bring merger 104b into a position to perform the derivation of these candidate sample sets in the same way as merger 30 at the encoder side determined the set of candidate sample sets.
As will be described in more detail below, merger 30 and data stream inserter 18 cooperate in order to transmit one or more syntax elements for each sample set, which specify whether the sample set is merged with another sample set, which, in turn, may be part of an already merged group of sample sets, and which of the set of candidate sample sets is employed for the merging. Extractor 102, in turn, extracts these syntax elements and informs merger 104b accordingly. In particular, according to the specific embodiment described later, one or two syntax elements are transmitted for specifying the merging information for a specific sample set. The first syntax element specifies whether the current sample set is merged with another sample set. The second syntax element, which is only transmitted if the first syntax element specifies that the current sample set is merged with another sample set, specifies which of the set of candidate sample sets is employed for the merging. The transmission of the first syntax element may be suppressed if a derived set of candidate sample sets is empty. In other words, the first syntax element may only be transmitted if a derived set of candidate sample sets is not empty. The second syntax element may only be transmitted if a derived set of candidate sample sets contains more than one sample set, since, if merely one sample set is contained in the set of candidate sample sets, a further selection is not possible anyway. Further, the transmission of the second syntax element may be suppressed if the set of candidate sample sets comprises more than one sample set, but all of the sample sets of the set of candidate sample sets are associated with the same coding parameters. In other words, the second syntax element may only be transmitted if at least two sample sets of a derived set of candidate sample sets are associated with different coding parameters.
Within the bit stream, the fusion information for a sample set can be encoded before the prediction parameters or other specific coding parameters associated with that sample set. The prediction or coding parameters are transmitted only if the fusion information signals that the current sample set is not to be merged with any other sample set.
The fusion information for a given sample set, that is, a block, for example, can be encoded after an appropriate subset of the prediction parameters, or, in a broader sense, of the coding parameters associated with the respective sample set, has been transmitted. The subset of prediction/coding parameters can consist of one or more reference image indices, or one or more components of a motion parameter vector, or a reference index and one or more components of a motion parameter vector, etc. The subset of prediction or coding parameters already transmitted can be used to derive a reduced set of candidate sample sets from a larger, provisional set of candidate sample sets, which may have been derived as described above. As an example, a difference or distance measure, according to a predetermined distance measure, between the already coded prediction or coding parameters of the current sample set and the corresponding prediction or coding parameters of the preliminary set of candidate sample sets can be calculated. Then, only the sample sets for which the calculated difference measure, or distance, is less than or equal to a predefined or derived threshold are included in the final, that is, reduced, set of candidate sample sets. See, for example, fig. 9a. The current sample set is block X. A subset of the coding parameters belonging to this block will have already been inserted into data stream 22. Imagine, for example, that block X is a prediction block, in which case the appropriate subset of the coding parameters can be a subset of the prediction parameters for this block X, such as a subset of a set comprising an image reference index and mapping and motion information, such as a motion vector. If block X is a residual block, the subset of coding parameters is a subset of residual information, such as transformation coefficients, or a map indicating the positions of the significant transformation coefficients within block X.
Based on this information, both data stream inserter 18 and extractor 102 are able to use this information to determine a subset of blocks A and B, which forms, in this specific embodiment, the previously mentioned preliminary set of candidate sample sets. In particular, since blocks A and B belong to the set of causal sample sets, their coding parameters are available to the encoder and the decoder at the time when the coding parameters of block X are encoded/decoded. Therefore, the above comparison using the difference measure can be used to exclude any number of blocks from the preliminary set of candidate sample sets A and B. The resulting reduced set of candidate sample sets can then be used as described above, namely, in order to determine whether a fusion indicator indicating a merger is to be transmitted or extracted from the data stream, depending on the number of sample sets in the reduced set of candidate sample sets, and whether a second syntax element has to be transmitted, or has to be extracted from the data stream, with that syntax element indicating which of the sample sets within the reduced set of candidate sample sets is to be the partner block for the merger. That is, the decision to merge, or the transmission of the respective merge syntax elements for a simply linked predetermined region, may depend on the number of simply linked regions that have the predetermined relative location relationship to the simply linked predetermined region and that, at the same time, have associated coding parameters satisfying the predetermined relation to the first subset of the coding parameters for the simply linked predetermined region, while the adoption, or the prediction with extraction of the prediction residual, can be performed on the second subset of the coding parameters for the simply linked predetermined region.
That is, the second subset of the coding parameters of one of the identified simply linked regions, namely those having the predetermined relative location relationship to the simply linked predetermined region and, at the same time, having associated coding parameters that fulfill the predetermined relation to the first subset of coding parameters for the simply linked predetermined region, can be adopted as, or used to predict, the second subset of the coding parameters of the simply linked predetermined region, respectively.
The aforementioned threshold against which the aforementioned distances are compared can be fixed and known to both encoder and decoder, or it can be derived based on the calculated distances, such as the mean of the difference values or some other central tendency. In this case, the reduced set of candidate sample sets would inevitably be a proper subset of the preliminary set of candidate sample sets. Alternatively, only those sample sets are selected from the preliminary set of candidate sample sets for which the distance according to the distance measure is minimized. Alternatively, exactly one sample set is selected from the preliminary set of candidate sample sets using the aforementioned distance measure. In the latter case, the fusion information only needs to specify whether the current sample set is to be merged with this single candidate sample set or not.
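The candidate reduction just described can be sketched as follows. This is a minimal, hypothetical Python illustration, not part of the patent: the concrete distance measure and the threshold derivation are left open by the text, so a component-wise absolute difference of motion parameters and the mean of the computed distances are used here purely as assumptions.

```python
def reduce_candidates(current_params, preliminary_candidates, threshold=None):
    """Reduce a preliminary candidate list using a distance measure on the
    already-coded parameter subset (e.g. motion vector components).
    `preliminary_candidates` is a list of (block_id, params) pairs."""
    def distance(a, b):
        # Assumed measure: sum of absolute component differences.
        return sum(abs(x - y) for x, y in zip(a, b))

    dists = [(cand, distance(current_params, params))
             for cand, params in preliminary_candidates]
    if threshold is None:
        # One possible derivation named in the text: the mean of the distances.
        threshold = sum(d for _, d in dists) / len(dists)
    # Keep only candidates whose distance does not exceed the threshold.
    return [cand for cand, d in dists if d <= threshold]
```

With a mean-derived threshold, at least the nearest candidate always survives, so the reduced set is a non-empty subset of the preliminary set.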
Thus, the set of candidate blocks can be formed or derived as described below, with reference to fig. 9a. Starting from the position of the top-left sample 400 of the current block X in fig. 9a, the position of its left neighboring sample 402 and the position of its top neighboring sample 404 are derived on both the encoder and decoder sides. The set of candidate blocks can therefore have only up to two elements, namely those blocks of the shaded set of causal blocks in fig. 9a that contain one of the two sample positions, which in the case of fig. 9a are blocks B and A. Thus, the set of candidate blocks can have only the two blocks directly adjacent to the top-left sample position of the current block as its elements. According to another embodiment, the set of candidate blocks can be given by all blocks that were coded before the current block and contain one or more samples that are direct spatial neighbors of any sample of the current block. The direct spatial neighborhood can be restricted to direct left neighbors and/or direct top neighbors and/or direct right neighbors and/or direct bottom neighbors of any sample of the current block. See, for example, fig. 9b, showing another block subdivision. In this case, the candidate blocks comprise four blocks, namely blocks A, B, C and D.
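The two-element derivation of fig. 9a can be sketched as below. This is an illustrative assumption, not the patent's implementation: `block_map`, mapping a sample position to the id of the already coded block covering it, is a hypothetical representation chosen only for the sketch.

```python
def derive_candidates(block_map, x, y):
    """Derive up to two merge candidates for the block whose top-left sample
    sits at (x, y): the block covering the directly left neighboring sample
    (x-1, y) and the block covering the directly top neighboring sample
    (x, y-1), skipping positions outside the causal (already coded) area."""
    candidates = []
    for nx, ny in ((x - 1, y), (x, y - 1)):  # left neighbor, then top neighbor
        block = block_map.get((nx, ny))       # None: outside image / not coded
        if block is not None and block not in candidates:
            candidates.append(block)
    return candidates
```

At the top-left image corner both neighboring positions fall outside the image, so the candidate set is empty and, per the embodiment above, no fusion indicator would be transmitted.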
Alternatively, the set of candidate blocks may additionally or exclusively include blocks that contain one or more samples located at the same position as any of the samples of the current block, but contained in a different, that is, already encoded/decoded, image.
Yet alternatively, the candidate set of blocks represents a subset of the sets of blocks described above, which were determined by the neighborhood in a spatial or temporal direction. The subset of candidate blocks can be fixed, signaled or derived. The derivation of the subset of candidate blocks can take into account decisions made for other blocks in the image or in other images. As an example, blocks that are associated with the same or very similar coding parameters as another candidate block may be excluded from the set of candidate blocks.
The following description of an embodiment applies to the case where only the two blocks containing the left and top neighboring samples of the top-left sample of the current block are considered as potential merging candidates.
If the candidate block set is not empty, a flag called merge_flag is signaled, specifying whether the current block is merged with any of the candidate blocks. If merge_flag is equal to 0 (for false), this block is not merged with any of its candidate blocks and all coding parameters are transmitted normally. If merge_flag is equal to 1 (for true), the following applies. If the set of candidate blocks contains a single block, this candidate block is used for the merger. Otherwise, the set of candidate blocks contains exactly two blocks. If the prediction parameters of these two blocks are identical, these prediction parameters are used for the current block. Otherwise (the two blocks have different prediction parameters), a flag called merge_left_flag is signaled. If merge_left_flag is equal to 1 (for true), the block containing the left neighboring sample position of the top-left sample position of the current block is selected from the set of candidate blocks. If merge_left_flag is equal to 0 (for false), the other (that is, top neighboring) block of the set of candidate blocks is selected. The prediction parameters of the selected block are used for the current block.
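The decision rules above can be condensed into a short sketch. This is an assumption-laden illustration rather than normative decoder code: `read_flag` stands in for entropy-decoding one binary symbol from the bit stream, `params_of` maps a candidate block to its prediction parameters, and the candidate list is assumed ordered (left neighbor, top neighbor).

```python
def decode_merge_info(candidates, params_of, read_flag):
    """Return the merging partner selected by merge_flag / merge_left_flag,
    or None if the current block is not merged."""
    if not candidates:
        return None               # empty candidate set: no merge_flag at all
    if read_flag() == 0:          # merge_flag == 0: no merging
        return None
    if len(candidates) == 1:
        return candidates[0]      # single candidate, no further signaling
    left, top = candidates        # assumed order: (left neighbor, top neighbor)
    if params_of[left] == params_of[top]:
        return left               # identical parameters: merge_left_flag omitted
    # merge_left_flag == 1 selects the left neighbor, 0 the top neighbor
    return left if read_flag() == 1 else top
```

Note how the flag reads mirror the transmission-suppression rules: no flag is read for an empty candidate set, and merge_left_flag is read only when two candidates with different parameters exist.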
In summary, for some of the embodiments described above in relation to the merger, reference is made to fig. 10, showing the steps performed by extractor 102 to extract the fusion information from data stream 22 arriving at input 116.
The process starts at 450 with the identification of the candidate blocks or sample sets for a currently visited sample set or block. It should be recalled that the coding parameters for the blocks are transmitted within data stream 22 in a given scan order and, therefore, fig. 10 refers to the process of retrieving the fusion information for the currently visited sample set or block.
As mentioned earlier, the identification in step 450 may comprise an identification among previously decoded blocks, that is, the causal set of blocks, based on neighborhood aspects. For example, those neighboring blocks can be appointed candidates that contain certain neighboring samples which are spatial or temporal neighbors of one or more geometrically predetermined samples of the current block X. In addition, the identification step can comprise two stages, namely a first stage involving an identification as just mentioned, that is, based on the neighborhood, leading to a preliminary set of candidate blocks, and a second stage according to which only those blocks are appointed candidates whose already transmitted coding parameters satisfy a certain relationship to a subset of the coding parameters of the current block X, which has already been decoded from the data stream before step 450.
Then, the process proceeds to step 452, where it is determined whether the number of candidate blocks is greater than zero. If this is the case, a merge_flag is extracted from the data stream in step 454. Extraction step 454 may involve entropy decoding. The context for entropy decoding the merge_flag in step 454 can be determined based on syntax elements belonging to, for example, the set of candidate blocks or the set of preliminary candidate blocks, where the dependence on the syntax elements can be restricted to the information whether the blocks belonging to the set of interest have been merged or not. The probability estimate of the selected context can be adapted.
If, however, the number of candidate blocks is determined in step 452 to be zero, the process of fig. 10 continues with step 456, in which the coding parameters of the current block are extracted from the bit stream, or, in the case of the aforementioned two-stage identification alternative, the remaining coding parameters thereof, whereupon extractor 102 proceeds with the processing of the next block in the block scan order, such as order 350 shown in fig. 3c.
Returning to step 454, the process continues, after the extraction in step 454, with step 458, with a check as to whether the extracted merge_flag suggests the occurrence or absence of a merger of the current block. If no merger is to take place, the process proceeds with the aforementioned step 456. Otherwise, the process continues with step 460, including a check as to whether the number of candidate blocks is equal to one. If this is the case, the transmission of an indication of a certain candidate block among the candidate blocks is not necessary and, therefore, the process of fig. 10 proceeds with step 462, whereby the fusion partner of the current block is set to be the only candidate block, whereupon, in step 464, the coding parameters of the fusion partner block are used for adoption or prediction of the remaining coding parameters, or of the coding parameters, of the current block. In the case of adoption, the missing coding parameters of the current block are merely copied from the fusion partner block. In the other case, that is, in the case of prediction, step 464 may involve an additional extraction of residual data from the data stream, the residual data relating to the prediction residual of the missing coding parameters of the current block, and a combination of this residual data with the prediction of these missing coding parameters obtained from the fusion partner block.
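The adoption-versus-prediction distinction of step 464 can be illustrated as follows. This is a sketch under stated assumptions, not the patent's method: coding parameters are modeled as named numeric tuples (for instance, a motion vector), and the prediction residual is simply added component-wise.

```python
def complete_coding_params(partner_params, mode, residual=None):
    """Fill in the current block's missing coding parameters from the fusion
    partner block: copy them ('adopt'), or treat them as a prediction and
    refine with a transmitted residual ('predict')."""
    if mode == "adopt":
        return dict(partner_params)   # plain copy from the fusion partner
    assert mode == "predict" and residual is not None
    # add the transmitted residual to the prediction taken from the partner
    return {key: tuple(p + r for p, r in zip(pred, residual[key]))
            for key, pred in partner_params.items()}
```

In the adoption case no residual is extracted from the data stream at all, which is exactly the side-information saving the merging scheme aims at.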
If, however, the number of candidate blocks is determined to be greater than one in step 460, the process of fig. 10 proceeds to step 466, where a check is performed as to whether the coding parameters, or the interesting part of the coding parameters, namely the sub-part relating to the part not yet transferred within the data stream for the current block, are identical to each other. If so, these common coding parameters are set as the fusion reference, or the candidate blocks are set as fusion partners, in step 468, and the respective interesting coding parameters are used for adoption or prediction in step 464.
It should be noted that the fusion partner itself may have been a block for which a merger was signaled. In this case, the adopted or predictively obtained coding parameters of that fusion partner are used in step 464.
Otherwise, however, that is, in the case where the coding parameters are not identical, the process of fig. 10 proceeds to step 470, where an additional syntax element is extracted from the data stream, namely the merge_left_flag. A separate set of contexts can be used for entropy decoding this flag. The set of contexts used for entropy decoding the merge_left_flag can also comprise only one context. After step 470, the candidate block indicated by merge_left_flag is set to be the fusion partner in step 472 and is used for adoption or prediction in step 464. After step 464, extractor 102 proceeds with handling the next block in block order.
Of course, there are many alternatives. For example, a combined syntax element can be transmitted within the data stream instead of the separate syntax elements merge_flag and merge_left_flag described earlier, the combined syntax element signaling the merging process. In addition, the aforementioned merge_left_flag can be transmitted within the data stream regardless of whether the two candidate blocks have the same prediction parameters or not, thus reducing the computational overhead of carrying out the process of fig. 10.
As already indicated in relation to, for example, fig. 9b, more than two blocks can be included in the set of candidate blocks. In addition, the fusion information, that is, the information signaling whether a block is merged and, if so, with which candidate block it is to be merged, can be signaled by one or more syntax elements. One syntax element can specify whether the block is merged with any of the candidate blocks, such as the merge_flag described above. The flag can only be transmitted if the set of candidate blocks is not empty. A second syntax element can signal which of the candidate blocks is used for the merger, just like the aforementioned merge_left_flag, but, in general, it indicates a selection among two or more candidate blocks. The second syntax element can be transmitted only if the first syntax element signals that the current block is to be merged with one of the candidate blocks. The second syntax element can further be transmitted only if the set of candidate blocks contains more than one candidate block and/or if any of the candidate blocks has different prediction parameters than any other of the candidate blocks. The syntax can depend on how many candidate blocks are present and/or on how different the prediction parameters associated with the candidate blocks are.
The syntax for signaling which of the candidate blocks is to be used can be set simultaneously and/or in parallel on the encoder and decoder sides. For example, if there are three options for candidate blocks identified in step 450, the syntax is chosen in such a way that only these three options are available and are considered for entropy coding, for example, in step 470. In other words, the syntax element is chosen in such a way that its symbol alphabet has merely as many elements as there are choices of candidate blocks. The probabilities for all other choices can be considered to be zero, and the entropy encoding/decoding can be adjusted simultaneously at encoder and decoder.
In addition, as already mentioned in relation to step 464, the prediction parameters inferred as a consequence of the merging process can represent the complete set of prediction parameters associated with the current block, or they can represent a subset of these prediction parameters, such as the prediction parameters for one hypothesis of a block for which multi-hypothesis prediction is used.
As noted above, the syntax elements relating to the fusion information can be entropy-encoded using context modeling. The syntax elements can consist of the merge_flag and the merge_left_flag described above (or similar syntax elements). In a concrete example, one of three context models or contexts can be used for encoding/decoding the merge_flag in step 454, for example. The context model index merge_flag_ctx used can be derived as follows: if the set of candidate blocks contains two elements, the value of merge_flag_ctx is equal to the sum of the merge_flag values of the two candidate blocks. If the set of candidate blocks contains one element, however, the value of merge_flag_ctx can be equal to twice the merge_flag value of this candidate block. As each merge_flag of the neighboring candidate blocks can be either one or zero, three contexts are available for the merge_flag. The merge_left_flag can be encoded using only a single probability model.
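The context index derivation just described is small enough to state directly. The sketch below follows the rule in the paragraph above; the behavior for an empty candidate set is an assumption (the flag would not be transmitted in that case anyway).

```python
def merge_flag_ctx(candidate_merge_flags):
    """Context model index for entropy coding merge_flag, derived from the
    merge_flag values (0 or 1) of the at most two candidate blocks."""
    if len(candidate_merge_flags) == 2:
        # Two candidates: sum of their merge_flag values -> 0, 1 or 2.
        return candidate_merge_flags[0] + candidate_merge_flags[1]
    if len(candidate_merge_flags) == 1:
        # One candidate: twice its merge_flag value -> 0 or 2.
        return 2 * candidate_merge_flags[0]
    return 0  # no candidates: merge_flag is not coded (assumed fallback)
```

Either branch yields an index in {0, 1, 2}, matching the three available contexts.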
However, according to an alternative embodiment, different context models can be used. For example, non-binary syntax elements can be mapped onto a sequence of binary symbols, called bins. The context models for some syntax elements, or bins of syntax elements, defining the fusion information can be derived based on already transmitted syntax elements of neighboring blocks or on the number of candidate blocks or other measures, while other syntax elements or bins of syntax elements can be encoded with a fixed context model.
Regarding the above description of the merging of blocks, it should be noted that the set of candidate blocks can also be derived in the same way as for any of the embodiments described above, with the following modification: candidate blocks are restricted to blocks using motion-compensated prediction, or inter prediction, respectively. Only those can be elements of the set of candidate blocks. The signaling and context modeling of the fusion information can be performed as described above.
Returning to the combination of the multitree subdivision embodiments described above with the merging aspect described now, if the image is divided into square blocks of variable size by means of a quadtree-based subdivision structure, for example, the merge_flag and merge_left_flag, or other syntax elements specifying the merger, can be interleaved with the prediction parameters that are transmitted for each leaf node of the quadtree structure. Consider again, for example, fig. 9a. Fig. 9a shows an example of a quadtree-based subdivision of an image into prediction blocks of varying size. The two largest blocks are so-called tree blocks, that is, prediction blocks of the maximum possible size. The other blocks in this figure are obtained as a subdivision of their corresponding tree block. The current block is marked with an "X". All shaded blocks are encoded/decoded before the current block, and thus form the set of causal blocks. As explained in the description of the derivation of the set of candidate blocks for one of the embodiments, only blocks containing the directly neighboring (that is, top or left) samples of the top-left sample position of the current block can be members of the set of candidate blocks. Thus, the current block can be merged with either block A or block B. If merge_flag is equal to 0 (for false), the current block X is not merged with either of the two blocks. If blocks A and B have identical prediction parameters, no distinction needs to be made, since merging with either of the two blocks will lead to the same result. Thus, in this case, the merge_left_flag is not transmitted. Otherwise, if blocks A and B have different prediction parameters, merge_left_flag equal to 1 (for true) will merge blocks X and B, while merge_left_flag equal to 0 (for false) will merge blocks X and A. In another preferred embodiment, additional (already transmitted) neighboring blocks represent candidates for the merger.
In fig. 9b, another example is shown. Here, the current block X and the left neighboring block B are tree blocks, that is, they have the maximum allowed block size. The size of the top neighboring block A is one quarter of the tree block size. The blocks that are elements of the set of causal blocks are shaded. Note that, according to one preferred embodiment, the current block X can only be merged with the two blocks A or B, and not with any of the other top neighboring blocks. In another preferred embodiment, additional (already transmitted) neighboring blocks represent candidates for the merger.
Before proceeding with the description regarding the aspect of how to handle different sample arrangements of an image according to the embodiments of the present application, it should be noted that the above discussion concerning the multitree subdivision and its signaling on the one hand, and the fusion aspect on the other hand, makes clear that these aspects provide advantages that can be exploited independently of each other. That is, as explained above, the combination of a multitree subdivision with fusion has specific advantages, but advantages also result from alternatives in which, for example, the fusion feature is incorporated while the subdivision performed by subdividers 30 and 104 is not based on a quadtree or multitree subdivision, but corresponds to a macroblock subdivision with a regular partitioning of these macroblocks into smaller partitions. On the other hand, in turn, the combination of the multitree subdivision together with the transmission of the maximum tree block size within the bit stream, and the use of the multitree subdivision together with the depth-first traversal order for conveying the coding parameters of the corresponding blocks, is advantageous regardless of whether the fusion feature is used concurrently or not. In general, the advantages of fusion can be understood when considering that, intuitively, the coding efficiency can be increased when the syntax of sample arrangement codings is extended in a way that not only allows subdividing a block, but also merging two or more of the blocks obtained after the subdivision. As a result, a group of blocks is obtained that are coded with the same prediction parameters. The prediction parameters for such a group of blocks need to be coded only once.
In addition, in relation to the fusion of sample sets, it should again be noted that the sample sets considered may be rectangular or square blocks, in which case the merged sample sets represent a collection of rectangular and/or square blocks. Alternatively, however, the considered sample sets are arbitrarily shaped image regions, and the merged sample sets represent a collection of such arbitrarily shaped image regions.
The following description focuses on the handling of more than one sample arrangement per image, and some aspects described in the following are advantageous regardless of the type of subdivision used, that is, regardless of whether the subdivision is based on a multitree subdivision or not, and regardless of whether merging is used or not. Before starting with the description of specific embodiments related to the processing of different sample arrangements of an image, the main problem of these embodiments is motivated by a brief introduction to the field of handling different sample arrangements per image.
The following discussion focuses on coding parameters between blocks of different sample arrangements of an image in an image or video coding application and, in particular, on a way of adaptively predicting coding parameters between the different sample arrangements of an image for, for example, but not exclusively, the encoder and decoder of figs. 1 and 2, respectively, or another image or video coding environment. The sample arrangements can, as noted above, represent sample arrangements related to different color components, or sample arrangements associating an image with additional information, such as transparency data or depth maps. Sample arrangements related to the color components of an image are also called color planes. The technique described below is also called inter-plane adoption/prediction, and it can be used in block-based image and video encoders and decoders, where the processing order of the blocks of the sample arrangements of an image can be arbitrary.
Image and video encoders are usually designed for coding color images (either still images or images of a video sequence). A color image consists of multiple color planes, which represent sample arrangements for different color components. Often, color images are encoded as a set of sample arrangements consisting of a luminance plane and two chrominance planes, where the latter specify color difference components. In some application domains, it is also common for the set of coded sample arrangements to consist of three color planes representing the sample arrangements for the three primary colors red, green and blue. In addition, for an improved color representation, a color image can consist of more than three color planes. In addition, an image can be associated with auxiliary sample arrangements that specify additional information for the image. For example, such auxiliary sample arrangements may be sample arrangements that specify the transparency (suitable for specific display purposes) for the associated color sample arrangements, or sample arrangements that specify a depth map (suitable for rendering multiple views, for example, for 3D displays).
In conventional image and video coding standards (such as H.264), the color planes are generally coded together, whereby certain coding parameters, such as macroblock and sub-macroblock prediction modes, reference indices and motion vectors, are used for all color components of a block. The luminance plane can be considered the primary color plane, for which the specific coding parameters are specified in the bit stream, and the chrominance planes can be considered secondary planes, for which the corresponding coding parameters are inferred from the primary luminance plane. Each luminance block is associated with two chrominance blocks representing the same area of the image. Depending on the chrominance sampling format used, the chrominance sample arrangements can be smaller than the luminance sample arrangement for a block. For each macroblock, which consists of one luminance and two chrominance components, the same partitioning into smaller blocks is used (if the macroblock is subdivided). For each block consisting of a block of luminance samples and two blocks of chrominance samples (which can be the macroblock itself or a subblock of the macroblock), the same set of prediction parameters, such as reference indices, motion parameters and intra-prediction modes, is employed. In specific profiles of conventional video coding standards (such as the 4:4:4 profiles in H.264), it is also possible to code the different color planes of an image independently. In this configuration, the macroblock partitioning, the prediction modes, the reference indices and the motion parameters can be chosen separately for each color component of a macroblock or subblock. In conventional coding standards, either all color planes are coded together, using the same set of specific coding parameters (such as subdivision information and prediction parameters), or all color planes are coded completely independently of each other.
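The dependence of the chrominance block size on the chrominance sampling format can be made concrete with a small sketch. This is a generic illustration of the common sampling formats, not text from the patent; the format names and the helper are assumptions for the example.

```python
def chroma_block_size(luma_w, luma_h, sampling="4:2:0"):
    """Size of each of the two chrominance blocks associated with a
    luminance block of size luma_w x luma_h, for the usual chroma
    sampling formats (horizontal, vertical subsampling factors)."""
    subsample = {"4:4:4": (1, 1),   # no subsampling
                 "4:2:2": (2, 1),   # half horizontal resolution
                 "4:2:0": (2, 2)}   # half resolution in both directions
    sx, sy = subsample[sampling]
    return luma_w // sx, luma_h // sy
```

For a 16x16 luminance macroblock in 4:2:0 sampling, each of the two chrominance blocks covering the same image area is thus 8x8, while in the 4:4:4 profiles all three planes have equal-size blocks.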
If the color planes are coded together, one set of subdivision and prediction parameters must be used for all color components of a block. This ensures that the side information is kept small, but it can result in a reduction in coding efficiency compared to independent coding, since the use of different block decompositions and prediction parameters for different color components can result in a lower rate-distortion cost. As an example, the use of a different motion vector or reference image for the chrominance components can significantly reduce the energy of the residual signal for the chrominance components and increase their overall coding efficiency. If the color planes are coded independently, the coding parameters, such as the block partitioning, the reference indices and the motion parameters, can be selected for each color component separately in order to optimize the coding efficiency for each color component. But it is then not possible to exploit the redundancy between the color components. The multiple transmissions of certain coding parameters result in an increased side information rate (compared to combined coding), and this increased side information rate can have a negative impact on the overall coding efficiency. In addition, the support for auxiliary sample arrangements in state-of-the-art video coding standards (such as H.264) is restricted to the case in which the auxiliary sample arrangements are coded using their own set of coding parameters.
Thus, in all embodiments described so far, the image planes can be handled as described above, but, as also discussed above, the overall coding efficiency for the coding of multiple sample arrangements (which may be related to different color planes and/or auxiliary sample arrangements) can be increased if it were possible to decide on a block basis, for example, whether all sample arrangements of a block are coded with the same coding parameters or whether different coding parameters are used. The basic idea of the following inter-plane prediction is to allow such an adaptive decision on a block basis, for example. The encoder can choose, for example, based on a rate-distortion criterion, whether all or some of the sample arrangements for a given block are coded using the same coding parameters or whether different coding parameters are used for different sample arrangements. This selection can also be achieved by signaling, for a given block of a sample arrangement, whether the specific coding parameters are inferred from an already coded co-located block of a different sample arrangement. It is also possible to arrange the different sample arrangements of an image in groups, which are also called sample arrangement groups or plane groups. Each plane group can contain one or more sample arrangements of an image. Then, the blocks of the sample arrangements within a plane group share the same selected coding parameters, such as subdivision information, prediction modes and residual coding modes, while other coding parameters, such as transformation coefficient levels, are transmitted separately for each sample arrangement within the plane group. One plane group is coded as the primary plane group, that is, none of its coding parameters is inferred or predicted from other plane groups. For each block of a secondary plane group, it can be adaptively chosen whether a new set of selected coding parameters is transmitted or whether the selected coding parameters are inferred or predicted from the primary plane group or from another secondary plane group. The decisions as to whether certain coding parameters for a given block are inferred or predicted are included in the bit stream. The inter-plane prediction allows greater freedom in choosing the trade-off between side information rate and prediction quality in relation to state-of-the-art coding of images consisting of multiple sample arrangements. The advantage is an improved coding efficiency in relation to conventional coding of images consisting of multiple sample arrangements.
Inter-plane adoption/prediction can extend an image or video encoder, such as those of the previous embodiments, in such a way that it can be adaptively chosen, for a block of a color sample array, or an auxiliary sample array, or a set of color sample arrays and/or auxiliary sample arrays, whether a selected set of coding parameters is inferred or predicted from already coded co-located blocks of other sample arrays in the same image, or whether the selected set of coding parameters for the block is coded independently, without referring to co-located blocks of other sample arrays of the same image. The decisions whether the selected set of coding parameters is inferred or predicted, for a block of a sample array or a block of multiple sample arrays, can be included in the bit stream. The different sample arrays that are associated with an image need not have the same size.
As described above, the sample arrays that are associated with an image (the sample arrays can represent color components and/or auxiliary sample arrays) can be arranged in two or more so-called plane groups, where each plane group consists of one or more sample arrays. The sample arrays that are contained in a particular plane group need not have the same size. Note that this arrangement into plane groups includes the case in which each sample array is coded separately.
To be more precise, according to an embodiment, it is adaptively chosen, for each block of a plane group, whether the coding parameters specifying how a block is predicted are inferred or predicted from an already coded co-located block of a different plane group for the same image, or whether these coding parameters are coded separately for the block. The coding parameters that specify how a block is predicted include one or more of the following: block prediction modes specifying which prediction is used for the block (intra prediction; inter prediction using a single motion vector and reference image; inter prediction using two motion vectors and reference images; inter prediction using a higher-order, i.e. non-translational, motion model and a single reference image; inter prediction using multiple motion models and reference images); intra prediction modes specifying how an intra prediction signal is generated; an identifier specifying how multiple prediction signals are combined to generate the final prediction signal for the block; reference indices specifying which reference image(s) is/are used for motion-compensated prediction; motion parameters (such as displacement vectors or affine motion parameters) specifying how the prediction signal(s) is/are generated using the reference image(s); and an identifier specifying how the reference image(s) is/are filtered to generate motion-compensated prediction signals. Note that, in general, a block can be associated with only a subset of the mentioned coding parameters. For instance, if the block prediction mode specifies that a block is intra predicted, the coding parameters for the block may additionally include intra prediction modes, but coding parameters such as reference indices and motion parameters, which specify how an inter prediction signal is generated, are not specified; or, if the block prediction mode specifies inter prediction, the associated coding parameters may additionally include reference indices and motion parameters, but intra prediction modes are not specified.
One of the two or more plane groups may be coded as, or indicated within the bit stream to be, the primary plane group. For all blocks of this primary plane group, the coding parameters specifying how the prediction signal is generated are transmitted without referring to other plane groups of the same image. The remaining plane groups are coded as secondary plane groups. For each block of the secondary plane groups, one or more syntax elements are transmitted that signal whether the coding parameters specifying how the block is predicted are inferred or predicted from a co-located block of other plane groups, or whether a new set of these coding parameters is transmitted for the block. One of the one or more syntax elements may be referred to as an inter-plane prediction flag or inter-plane prediction parameter. If the syntax elements signal that the corresponding coding parameters are not inferred or predicted, a new set of coding parameters for the block is transmitted in the bit stream. If the syntax elements signal that the corresponding coding parameters are inferred or predicted, the co-located block in a so-called reference plane group is determined. The assignment of the reference plane group for the block can be configured in multiple ways. In one embodiment, a particular reference plane group is assigned to each secondary plane group; this assignment can be fixed or it can be signaled in high-level syntax structures, such as parameter sets, access unit headers, image headers or slice headers.
In a second embodiment, the assignment of the reference plane group is coded inside the bit stream and signaled by one or more syntax elements that are coded for a block, in order to specify whether the selected coding parameters are inferred or predicted or separately coded.
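Purely as an illustrative sketch of the two signaling variants just described (a fixed or high-level-signaled assignment per secondary plane group versus a per-block choice), the decode-side resolution of the reference plane group may be modeled as follows; all function and field names are hypothetical and not taken from any codec specification:

```python
def resolve_reference_plane_group(block, secondary_group,
                                  fixed_assignment=None, slice_header=None):
    """Return the plane-group id whose co-located block may supply parameters.

    Variant 1: a fixed assignment per secondary plane group, or one signaled
    in a high-level syntax structure (here a slice-header stand-in).
    Variant 2: the assignment is carried in the block's own syntax elements.
    """
    if fixed_assignment is not None:          # variant 1: fixed per group
        return fixed_assignment[secondary_group]
    if slice_header is not None:              # variant 1: high-level syntax
        return slice_header["ref_plane_group"][secondary_group]
    return block["ref_plane_group"]           # variant 2: per-block syntax


# Toy usage: secondary plane groups 1 and 2 both refer to primary group 0.
assignment = {1: 0, 2: 0}
blk = {"ref_plane_group": 2}
```

The dictionary-based bitstream stand-ins only serve to make the control flow of the two variants explicit.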
In order to ease the just-mentioned possibilities in connection with inter-plane prediction, and the following detailed embodiments, reference is made to fig. 11, which shows an illustrative image 500 composed of three sample arrays 502, 504 and 506. For the sake of easier understanding, only sub-portions of the sample arrays 502-506 are shown in fig. 11. The sample arrays are shown as if they were spatially registered against each other, so that the sample arrays 502-506 overlap each other along a direction 508 and so that a projection of the samples of the sample arrays 502-506 along direction 508 results in the samples of all these sample arrays 502-506 being correctly spatially located relative to each other. In other words, the planes 502 and 506 have been spread along the horizontal and vertical direction in order to adapt their spatial resolution to each other and to register them against each other.
According to an embodiment, all sample arrays of an image belong to the same portion of a spatial scene, where the spatial resolution along the vertical and horizontal direction may differ between the individual sample arrays 502-506. Further, for illustration purposes, the sample arrays 502 and 504 are considered to belong to one plane group 510, while the sample array 506 is considered to belong to another plane group 512. Furthermore, fig. 11 illustrates the exemplary case where the spatial resolution along the horizontal axis of the sample array 504 is twice the resolution in the horizontal direction of the sample array 502. Moreover, the sample array 504 is considered to form the primary array relative to the sample array 502, which forms a subordinate array relative to the primary array 504. As explained previously, in this case the subdivision of sample array 504 into blocks, as decided by subdivider 30 of fig. 1, is adopted by the subordinate array 502, where, according to the example of fig. 11, due to the vertical resolution of sample array 502 being half the resolution in the vertical direction of the primary array 504, each block has been halved into two horizontally juxtaposed blocks which, owing to the halving, are square blocks again when measured in units of the sample positions within the sample array 502.
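The adoption of the primary array's subdivision by a subordinate array of halved vertical resolution, as in the fig. 11 example, can be sketched as follows. The representation of leaf blocks as (x, y, w, h) tuples is purely illustrative, not part of the original disclosure:

```python
def adopt_subdivision(primary_blocks, vert_ratio=2):
    """Map leaf blocks (x, y, w, h), given in primary-array sample units,
    onto a subordinate array with 1/vert_ratio the vertical resolution.

    Each adopted block is split into horizontally juxtaposed square blocks,
    so the resulting blocks are square again when measured in units of the
    subordinate array's sample positions (cf. the fig. 11 discussion)."""
    out = []
    for (x, y, w, h) in primary_blocks:
        sh = h // vert_ratio              # block height in subordinate units
        # a w x sh rectangle decomposes into w // sh square sh x sh blocks
        for i in range(w // sh):
            out.append((x + i * sh, y // vert_ratio, sh, sh))
    return out
```

For a single 16x16 primary block the sketch yields two horizontally juxtaposed 8x8 blocks, matching the halving described above.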
As is exemplarily shown in fig. 11, the subdivision chosen for sample array 506 is different from the subdivision of the other plane group 510. As described above, subdivider 30 may select the subdivision of pixel array 506 separately from, or independently of, the subdivision for plane group 510. Of course, the sample resolution of array 506 may also differ from the resolutions of planes 502 and 504 of plane group 510.
Now, when coding the individual sample arrays 502-506, the encoder 10 may begin with coding the primary array 504 of plane group 510, for example in the manner described above. The blocks shown in fig. 11 may, for example, be the aforementioned prediction blocks. Alternatively, the blocks are residual blocks or other blocks defining the granularity at which certain coding parameters are defined. The inter-plane prediction is not restricted to quadtree or multitree subdivision, although this is what is illustrated in fig. 11.
After transmission of the syntax elements for the primary array 504, the encoder 10 may decide to declare the primary array 504 to be the reference plane for the subordinate plane 502. The encoder 10 and the inserter 18, respectively, may signal this decision via the bit stream 22, while, on the other hand, the association may be clear from the fact that the sample array 504 forms the primary array of the plane group 510, which information, in turn, may also be part of the bit stream 22. In any case, for each block within sample array 502, the inserter 18, or any other module of encoder 10 together with inserter 18, may decide to suppress a transfer of the coding parameters of this block within the bit stream and to signal within the bit stream, for this block, that the coding parameters of a co-located block within the primary array 504 shall be used instead, or that the coding parameters of the co-located block within the primary array 504 shall be used as a prediction for the coding parameters of the current block of sample array 502, merely transferring the residual data therefor for the current block of sample array 502 within the bit stream. In case of a negative decision, the coding parameters are transferred within the data stream as usual. The decision is signaled within the data stream 22 for each block. At the decoder side, the extractor 102 uses this inter-plane prediction information for each block in order to gain the coding parameters of the respective block of sample array 502 accordingly, namely by inferring the coding parameters of the co-located block of the primary array 504 or, alternatively, by extracting residual data for this block from the data stream and combining this residual data with a prediction obtained from the coding parameters of the co-located block of the primary array 504, if the inter-plane adoption/prediction information suggests inter-plane adoption/prediction, or by extracting the coding parameters of the current block of sample array 502 as usual, independently of the primary array 504.
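The decoder-side derivation just described distinguishes three cases per block: outright adoption of the co-located block's parameters, prediction plus transmitted residual, or independent coding. A minimal sketch, with hypothetical parameter names and the decision represented as a plain string:

```python
def derive_block_parameters(signal, colocated_params,
                            residual=None, coded_params=None):
    """Derive a block's coding parameters per the inter-plane decision.

    signal: 'adopt' (copy co-located parameters), 'predict' (co-located
    parameters plus transmitted deltas), or 'code' (independent coding).
    The string labels are illustrative stand-ins for the signaled syntax.
    """
    if signal == "adopt":          # copy the co-located parameters outright
        return dict(colocated_params)
    if signal == "predict":        # combine prediction and transmitted deltas
        return {k: colocated_params[k] + residual.get(k, 0)
                for k in colocated_params}
    return dict(coded_params)      # coded independently, as usual
```

In a real decoder the `signal` value would of course be parsed from the bit stream rather than passed in directly.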
As described above, the reference plane need not reside within the same plane group as the plane of the block for which inter-plane prediction is currently of interest. Thus, as described above, plane group 510 may represent the primary plane group or reference plane group for the secondary plane group 512. In this case, the bit stream may contain a syntax element indicating, for each block of sample array 506, whether the aforementioned adoption/prediction of the coding parameters of co-located blocks of any of the planes 502 and 504 of the primary plane group or reference plane group 510 shall be performed or not, where, in the latter case, the coding parameters of the current block of sample array 506 are transmitted as usual.
It should be noted that the subdivision and/or prediction parameters for the planes within one plane group may be the same, i.e. because they are only coded once for the plane group (all secondary planes of a plane group infer the subdivision information and/or prediction parameters from the primary plane within the same plane group), and the adaptive prediction or inference of the subdivision information and/or prediction parameters is done between plane groups.
It should be noted that the reference plane group can be a primary plane group or a secondary plane group.
The co-location between blocks of different planes within one plane group is readily understandable, since the subdivision of the primary sample array 504 is spatially adopted by the subordinate sample array 502, except for the just-described sub-partitioning of the blocks that makes the adopted leaf blocks square blocks. In case of inter-plane adoption/prediction between different plane groups, the co-location may be defined in a way that allows a greater freedom between the subdivisions of these plane groups. Given the reference plane group, the co-located block within the reference plane group is determined. The derivation of the co-located block and of the reference plane group can be done by a process similar to the following. A particular sample 514 of the current block 516 of a sample array 506 of the secondary plane group 512 is selected. This can be the top-left sample of the current block 516, as shown at 514 in fig. 11 for illustrative purposes, or a sample of the current block 516 next to the middle of the current block 516, or any other sample within the current block that is uniquely defined geometrically. The locations of this selected sample within the sample arrays 502 and 504 of the reference plane group 510 are calculated. The positions of sample 514 within the sample arrays 502 and 504 are indicated in fig. 11 at 518 and 520, respectively. Which of the planes 502 and 504 within the reference plane group 510 is actually used may be predetermined, or may be signaled within the bit stream. The sample within the corresponding sample array 502 or 504 of the reference plane group 510 that is closest to position 518 or 520, respectively, is determined, and the block containing this sample is chosen as the co-located block within the respective sample array 502 and 504, respectively. In the case of fig. 11, these are blocks 522 and 524, respectively. An alternative approach for determining the co-located block in other planes is described later.
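The co-located block derivation just described (select a geometrically unique sample, scale its position into the reference array's sampling grid, take the nearest sample, and return the block containing it) can be sketched as follows. The tuple-based block representation is illustrative only:

```python
def colocated_block(sample_pos, cur_size, ref_size, ref_blocks):
    """Find the co-located block in a reference-plane sample array.

    sample_pos: (x, y) of a geometrically unique sample of the current block
    (e.g. its top-left sample), in current-array sample units.
    cur_size, ref_size: (width, height) of the current and reference arrays.
    ref_blocks: leaf blocks of the reference array as (x, y, w, h) tuples.
    """
    # scale the position into the reference array's sampling grid
    rx = sample_pos[0] * ref_size[0] / cur_size[0]
    ry = sample_pos[1] * ref_size[1] / cur_size[1]
    # nearest sample position in the reference array
    nx, ny = round(rx), round(ry)
    # the block containing that sample is the co-located block
    for (x, y, w, h) in ref_blocks:
        if x <= nx < x + w and y <= ny < y + h:
            return (x, y, w, h)
    return None
```

For instance, with a reference array of twice the resolution in both directions, the sample (5, 5) of the current array maps to position (10, 10) and thus selects the bottom-right 8x8 reference block of a 16x16 area.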
In one embodiment, the coding parameters specifying the prediction for the current block 516 are completely inferred using the corresponding prediction parameters of the co-located block 522/524 in a different plane of group 510 of the same image 500, without transmitting additional side information. The inference can consist of a simple copy of the corresponding coding parameters or an adaptation of the coding parameters that takes into account the differences between the current plane group 512 and the reference plane group 510. As an example, this adaptation may consist of adding a motion parameter correction (e.g. a displacement vector correction) in order to take into account the phase difference between luminance and chrominance sample arrays; or the adaptation may consist of modifying the precision of the motion parameters (e.g. modifying the precision of the displacement vectors) in order to take into account the different resolution of luminance and chrominance sample arrays. In a further embodiment, one or more of the inferred coding parameters for specifying the prediction signal generation are not used directly for the current block 516, but are used as a prediction for the respective coding parameters of the current block 516, and a refinement of these coding parameters for the current block 516 is transmitted in the bit stream 22. As an example, the inferred motion parameters are not used directly; rather, motion parameter differences (such as a displacement vector difference) specifying the deviation between the motion parameters that are used for the current block 516 and the inferred motion parameters are coded within the bit stream; at the decoder side, the actually used motion parameters are obtained by combining the inferred motion parameters and the transmitted motion parameter differences.
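The two adaptations mentioned above (scaling a displacement vector for the different chrominance resolution, adding a phase correction) and the optional refinement by a transmitted difference can be illustrated in a few lines. The scale factors, offsets and parameter names are assumed example values, not normative:

```python
def infer_motion_vector(luma_mv, chroma_scale=(0.5, 0.5),
                        phase_offset=(0.0, 0.0), transmitted_delta=None):
    """Adapt a luminance displacement vector to a chrominance sample array.

    chroma_scale models the resolution ratio between the arrays (e.g. 0.5
    for half-resolution chroma); phase_offset models a phase correction.
    If transmitted_delta is given, the inferred vector serves only as a
    prediction and the coded difference is added on top.
    """
    mvx = luma_mv[0] * chroma_scale[0] + phase_offset[0]
    mvy = luma_mv[1] * chroma_scale[1] + phase_offset[1]
    if transmitted_delta is not None:   # prediction plus coded refinement
        mvx += transmitted_delta[0]
        mvy += transmitted_delta[1]
    return (mvx, mvy)
```

With pure inference, the luma vector (8, -4) becomes (4.0, -2.0) for half-resolution chroma; with a transmitted delta of (1, 0), the actually used vector becomes (5.0, -2.0).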
In another embodiment, the subdivision of a block, such as the tree-block subdivision of the aforementioned prediction subdivision into prediction blocks (i.e. blocks of samples for which the same set of prediction parameters is used), is adaptively inferred or predicted from an already coded co-located block of a different plane group for the same image, i.e. the bit sequence according to fig. 6a or 6b. In one embodiment, one of the two or more plane groups is coded as the primary plane group. For all blocks of this primary plane group, the subdivision information is transmitted without referring to other plane groups of the same image. The remaining plane groups are coded as secondary plane groups. For blocks of the secondary plane groups, one or more syntax elements are transmitted that signal whether the subdivision information is inferred or predicted from a co-located block of other plane groups, or whether the subdivision information is transmitted in the bit stream. One of the one or more syntax elements may be referred to as an inter-plane prediction flag or inter-plane prediction parameter. If the syntax elements signal that the subdivision information is not inferred or predicted, the subdivision information for the block is transmitted in the bit stream without referring to other plane groups of the same image. If the syntax elements signal that the subdivision information is inferred or predicted, the co-located block in a so-called reference plane group is determined. The assignment of the reference plane group for the block can be configured in multiple ways. In one embodiment, a particular reference plane group is assigned to each secondary plane group; this assignment can be fixed or it can be signaled in high-level syntax structures, such as parameter sets, access unit headers, image headers or slice headers. In a second embodiment, the assignment of the reference plane group is coded inside the bit stream and signaled by one or more syntax elements that are coded for a block, in order to specify whether the subdivision information is inferred or predicted or separately coded.
The reference plane group can be the primary plane group or another secondary plane group. Given the reference plane group, the co-located block within the reference plane group is determined. The co-located block is the block in the reference plane group that corresponds to the same image area as the current block, or the block that represents the block within the reference plane group that shares the largest portion of the image area with the current block. The co-located block can be partitioned into smaller prediction blocks.
In a further embodiment, the subdivision information for the current block, such as the quadtree-based subdivision according to figs. 6a or 6b, is completely inferred using the subdivision information of the co-located block in a different plane group of the same image, without transmitting additional side information. As a particular example, if the co-located block is partitioned into two or four prediction blocks, the current block is also partitioned into two or four sub-blocks for the purpose of prediction. As another particular example, if the co-located block is partitioned into four sub-blocks and one of these sub-blocks is further partitioned into four smaller sub-blocks, the current block is also partitioned into four sub-blocks, and one of these sub-blocks (the one corresponding to the sub-block of the co-located block that is further decomposed) is also partitioned into four smaller sub-blocks. In a further preferred embodiment, the inferred subdivision information is not used directly for the current block, but is used as a prediction for the actual subdivision information for the current block, and the corresponding refinement information is transmitted in the bit stream. As an example, the subdivision information that is inferred from the co-located block may be further refined. For each sub-block that corresponds to a sub-block of the co-located block that is not partitioned into smaller blocks, a syntax element can be coded in the bit stream, which specifies whether the sub-block is further decomposed in the current plane group. The transmission of such a syntax element can be conditioned on the size of the sub-block. Or it can be signaled in the bit stream that a sub-block that is further partitioned in the reference plane group is not partitioned into smaller blocks in the current plane group.
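The inheritance-plus-refinement of a quadtree subdivision described above can be sketched with a nested-list quadtree (None for a leaf, a list of four children for a split) and an iterator standing in for the refinement flags read from the bit stream. This representation is an illustrative assumption, not the coded syntax:

```python
def inherit_subdivision(ref_tree, extra_split_flags):
    """Inherit a quadtree from the co-located block of the reference plane
    group; wherever the reference tree has a leaf, one refinement flag (read
    from the bitstream stand-in) decides whether that leaf is decomposed
    further in the current plane group."""
    if ref_tree is None:                       # leaf in the reference plane
        if next(extra_split_flags, 0):         # refinement flag: split again?
            return [None, None, None, None]
        return None
    return [inherit_subdivision(c, extra_split_flags) for c in ref_tree]
```

With all refinement flags equal to zero, the current block simply reproduces the co-located block's subdivision; a single one-flag splits exactly the corresponding sub-block once more.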
In another embodiment, both the subdivision of a block into prediction blocks and the coding parameters specifying how the sub-blocks are predicted are adaptively inferred or predicted from an already coded co-located block of a different plane group for the same image. In a preferred embodiment of the invention, one of the two or more plane groups is coded as the primary plane group. For all blocks of this primary plane group, the subdivision information and the prediction parameters are transmitted without referring to other plane groups of the same image. The remaining plane groups are coded as secondary plane groups. For blocks of the secondary plane groups, one or more syntax elements are transmitted that signal whether the subdivision information and the prediction parameters are inferred or predicted from a co-located block of other plane groups, or whether the subdivision information and the prediction parameters are transmitted in the bit stream. One of the one or more syntax elements may be referred to as the inter-plane prediction flag or inter-plane prediction parameter. If the syntax elements signal that the subdivision information and the prediction parameters are not inferred or predicted, the subdivision information for the block and the prediction parameters for the resulting sub-blocks are transmitted in the bit stream without referring to other plane groups of the same image. If the syntax elements signal that the subdivision information and the prediction parameters for the sub-blocks are inferred or predicted, the co-located block in a so-called reference plane group is determined. The assignment of the reference plane group for the block can be configured in multiple ways. In one embodiment, a particular reference plane group is assigned to each secondary plane group; this assignment can be fixed or it can be signaled in high-level syntax structures, such as parameter sets, access unit headers, image headers or slice headers. In a second embodiment, the assignment of the reference plane group is coded inside the bit stream and signaled by one or more syntax elements that are coded for a block, in order to specify whether the subdivision information and the prediction parameters are inferred or predicted or separately coded. The reference plane group can be the primary plane group or another secondary plane group. Given the reference plane group, the co-located block within the reference plane group is determined. The co-located block may be the block in the reference plane group that corresponds to the same image area as the current block, or the block that represents the block within the reference plane group that shares the largest portion of the image area with the current block. The co-located block can be partitioned into smaller prediction blocks. In a preferred embodiment, the subdivision information for the current block, as well as the prediction parameters for the resulting sub-blocks, are completely inferred using the subdivision information of the co-located block in a different plane group of the same image and the prediction parameters of the corresponding sub-blocks, without transmitting additional side information. As a particular example, if the co-located block is partitioned into two or four prediction blocks, the current block is also partitioned into two or four sub-blocks for the purpose of prediction, and the prediction parameters for the sub-blocks of the current block are derived as described above.
As another particular example, if the co-located block is partitioned into four sub-blocks and one of these sub-blocks is further partitioned into four smaller sub-blocks, the current block is also partitioned into four sub-blocks and one of these sub-blocks (the one corresponding to the sub-block of the co-located block that is further decomposed) is also partitioned into four smaller sub-blocks, and the prediction parameters for all sub-blocks that are not further partitioned are inferred as described above. In a further preferred embodiment, the subdivision information is completely inferred based on the subdivision information of the co-located block in the reference plane group, but the inferred prediction parameters for the sub-blocks are used only as a prediction for the actual prediction parameters of the sub-blocks. The deviations between the actual prediction parameters and the inferred prediction parameters are coded in the bit stream. In a further embodiment, the inferred subdivision information is used as a prediction for the actual subdivision information of the current block, with the difference being transmitted in the bit stream (as described above), but the prediction parameters are completely inferred. In another embodiment, both the inferred subdivision information and the inferred prediction parameters are used as a prediction, and the differences between the actual subdivision information and prediction parameters and their inferred values are transmitted in the bit stream.
In another embodiment, it is adaptively chosen, for a block of a plane group, whether the residual coding modes (such as the transform type) are inferred or predicted from an already coded co-located block of a different plane group for the same image, or whether the residual coding modes are coded separately for the block. This embodiment is similar to the embodiment for the adaptive inference/prediction of the prediction parameters described above.
In another embodiment, the subdivision of a block (e.g. a prediction block) into transform blocks (i.e. blocks of samples to which a two-dimensional transform is applied) is adaptively inferred or predicted from an already coded co-located block of a different plane group for the same image. This embodiment is similar to the embodiment for the adaptive inference/prediction of the subdivision into prediction blocks described above.
In another embodiment, the subdivision of a block into transform blocks and the residual coding modes (e.g. transform types) for the resulting transform blocks are adaptively inferred or predicted from an already coded co-located block of a different plane group for the same image. This embodiment is similar to the embodiment for the adaptive inference/prediction of the subdivision into prediction blocks and the prediction parameters for the resulting prediction blocks described above.
In another embodiment, the subdivision of a block into prediction blocks, the associated prediction parameters, the subdivision information of the prediction blocks and the residual coding modes for the transform blocks are adaptively inferred or predicted from an already coded co-located block of a different plane group for the same image. This embodiment represents a combination of the embodiments described above. It is also possible that only some of the mentioned coding parameters are inferred or predicted.
Thus, inter-plane adoption/prediction may increase the coding efficiency as described previously. However, the coding efficiency gain by way of inter-plane adoption/prediction is also available in case other block subdivisions than multitree-based subdivisions are used, and independently of whether block merging is implemented or not.
The embodiments outlined above with respect to inter-plane adoption/prediction are applicable to image and video encoders and decoders that divide the color planes of an image and, if present, the auxiliary sample arrays associated with an image into blocks and associate these blocks with coding parameters. For each block, a set of coding parameters may be included in the bit stream. For example, these coding parameters can be parameters that describe how a block is predicted or decoded at the decoder side. As particular examples, the coding parameters can represent macroblock or block prediction modes, subdivision information, intra prediction modes, reference indices used for motion-compensated prediction, motion parameters such as displacement vectors, residual coding modes, transform coefficients, etc. The different sample arrays that are associated with an image can have different sizes.
Next, a scheme for enhanced signaling of coding parameters within a tree-based partitioning scheme, such as, for example, those described above with respect to figs. 1-8, is described. As with the other schemes, namely merging and inter-plane adoption/prediction, the effects and advantages of the enhanced signaling schemes, in the following often called inheritance, are described independently of the above embodiments, although the schemes described below are combinable with any of the above embodiments, either alone or in combination.
Generally, the improved coding scheme for coding side information within a tree-based partitioning scheme, called inheritance, described next enables the following advantages over conventional schemes of coding parameter treatment.
In conventional image and video coding, the images or particular sets of sample arrays for the images are usually decomposed into blocks, which are associated with particular coding parameters. The images usually consist of multiple sample arrays. In addition, an image may also be associated with additional auxiliary sample arrays, which may, for example, specify transparency information or depth maps. The sample arrays of an image (including auxiliary sample arrays) can be grouped into one or more so-called plane groups, where each plane group consists of one or more sample arrays. The plane groups of an image can be coded independently or, if the image is associated with more than one plane group, with prediction from other plane groups of the same image. Each plane group is usually decomposed into blocks. The blocks (or the corresponding blocks of sample arrays) are predicted by either inter-picture prediction or intra-picture prediction. The blocks can have different sizes and can be either square or rectangular. The partitioning of an image into blocks can be either fixed by the syntax, or it can be (at least partly) signaled inside the bit stream. Often, syntax elements are transmitted that signal the subdivision for blocks of predefined sizes. Such syntax elements may specify whether and how a block is subdivided into smaller blocks and associated coding parameters, e.g. for the purpose of prediction. For all samples of a block (or the corresponding blocks of sample arrays), the decoding of the associated coding parameters is specified in a certain way. In the example, all samples in a block are predicted using the same set of prediction parameters, such as reference indices (identifying a reference image in the set of already coded images), motion parameters (specifying a measure for the movement of a block between a reference image and the current image), parameters for specifying the interpolation filter, intra prediction modes, etc. The motion parameters can be represented by displacement vectors with a horizontal and a vertical component, or by higher-order motion parameters, such as affine motion parameters consisting of six components. It is also possible that more than one set of particular prediction parameters (such as reference indices and motion parameters) are associated with a single block. In that case, for each set of these particular prediction parameters, a single intermediate prediction signal for the block (or the corresponding blocks of sample arrays) is generated, and the final prediction signal is built by a combination including superimposing the intermediate prediction signals.
The corresponding weighting parameters, and possibly also a constant offset (which is added to the weighted sum), can be fixed for an image, a reference image, or a set of reference images, or they can be included in the set of prediction parameters for the corresponding block. The difference between the original blocks (or the corresponding blocks of sample arrangements) and their prediction signals, also referred to as the residual signal, is usually transformed and quantized. Often, a two-dimensional transform is applied to the residual signal (or the corresponding sample arrangements of the residual block). For transform coding, the blocks (or the corresponding blocks of sample arrangements) for which a certain set of prediction parameters has been used can be further divided before applying the transform. The transform blocks can be equal to or smaller than the blocks that are used for prediction. It is also possible for a transform block to include more than one of the blocks that are used for prediction. Different transform blocks can have different sizes, and the transform blocks can represent square or rectangular blocks. After the transform, the resulting transform coefficients are quantized and the so-called transform coefficient levels are obtained. The transform coefficient levels, as well as the prediction parameters and, if present, the subdivision information, are entropy-coded.
In some image and video coding standards, the possibilities for subdividing an image (or a plane group) into blocks provided by the syntax are very limited. Usually, it can only be specified whether, and potentially how, a block of a predefined size can be subdivided into smaller blocks. As an example, the largest block size in H.264 is 16x16. The 16x16 blocks are also referred to as macroblocks, and each image is divided into macroblocks in a first step. For each 16x16 macroblock, it can be signaled whether it is coded as a 16x16 block, or as two 16x8 blocks, or as two 8x16 blocks, or as four 8x8 blocks. If a 16x16 block is subdivided into four 8x8 blocks, each of these 8x8 blocks can be coded as one 8x8 block, or as two 8x4 blocks, or as two 4x8 blocks, or as four 4x4 blocks. The reduced set of possibilities for specifying the block partitioning in state-of-the-art image and video coding techniques has the advantage that the side information rate for signaling the subdivision information can be kept small, but it has the disadvantage that the bit rate needed to transmit the prediction parameters for the blocks can become significant, as explained below. The side information rate for signaling the prediction information does usually represent a significant amount of the overall bit rate of a block, and the coding efficiency can be increased when this side information is reduced, which, for example, can be achieved by using larger block sizes. Real images or images of a video sequence consist of arbitrarily shaped objects with specific properties. As an example, such objects or parts of objects are characterized by a unique texture or a unique motion. And, usually, the same set of prediction parameters can be applied to such an object or part of an object. But the object boundaries usually do not coincide with the possible block boundaries for large prediction blocks (e.g., 16x16 macroblocks in H.264).
An encoder usually determines the subdivision (among the limited set of possibilities) that results in a minimum of a particular cost measure, e.g., the rate-distortion cost. For arbitrarily shaped objects, this can result in a large number of small blocks. And since each of these small blocks is associated with a set of prediction parameters that needs to be transmitted, the side information rate can become a significant part of the overall bit rate. But since many of the small blocks still represent areas of the same object or part of an object, the prediction parameters for a number of the obtained blocks are the same or very similar. Intuitively, the coding efficiency can be increased when the syntax is extended in a way that not only allows subdividing a block, but also allows sharing the coding parameters between the blocks that are obtained after subdivision. In a tree-based subdivision, sharing the coding parameters for a given set of blocks can be achieved by assigning the coding parameters, or parts thereof, to one or more parent nodes in the tree-based hierarchy. As a result, the shared parameters, or parts thereof, can be used to reduce the side information that is needed to signal the actual choice of coding parameters for the blocks obtained after subdivision. The reduction can be achieved by omitting the signaling of the parameters for subsequent blocks or by using the shared parameter(s) for the prediction and/or context modeling of the parameters for subsequent blocks.
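The parent-node sharing described above can be sketched in a few lines. The following is a hypothetical illustration only, not the patented syntax: a tree node optionally carries a parameter set, and every leaf falls back to the nearest ancestor that carries one, so a set of parameters signaled once at an internal node is shared by all leaf blocks beneath it.

```python
# Illustrative sketch of tree-based sharing of coding parameters.
# Class and field names are assumptions, not the patent's terminology.

class Node:
    def __init__(self, params=None, children=None):
        self.params = params          # coding parameters signaled at this node, or None
        self.children = children or []

def effective_params(node, inherited=None):
    """Return the coding parameters of every leaf block, falling back to the
    nearest ancestor that carries parameters (the shared side information)."""
    current = node.params if node.params is not None else inherited
    if not node.children:             # a leaf block
        return [current]
    leaves = []
    for child in node.children:
        leaves.extend(effective_params(child, current))
    return leaves

# Four leaf blocks inherit one parameter set signaled once at their parent,
# so the side information is transmitted once instead of four times.
parent = Node(params={"mode": "intra", "dir": 3},
              children=[Node(), Node(), Node(), Node()])
print(effective_params(parent))
```

Note that the saving grows with the number of leaves that fall under the signaling node, which is exactly the trade-off discussed above.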
The basic idea of the inheritance scheme described below is to reduce the bit rate that is necessary for transmitting the coding parameters by sharing information along the tree-based hierarchy of blocks. The shared information is signaled within the bit stream (in addition to the subdivision information). The advantage of the inheritance scheme is an increased coding efficiency resulting from a decreased side information rate for the coding parameters.
In order to reduce the side information rate, according to the embodiments described below, the respective coding parameters for particular sets of samples, i.e., simply connected regions, which may represent rectangular or square blocks or arbitrarily shaped regions or any other collection of samples resulting from a multitree subdivision, are signaled within the data stream in an efficient way. The inheritance scheme described below means that the coding parameters do not have to be explicitly included in the bit stream in full for each of these sample sets. The coding parameters may represent prediction parameters, which specify how the corresponding set of samples is predicted using already coded samples. Many possibilities and examples have been described above and also apply here. As has also been indicated above, and as will be described further below, as far as the following inheritance scheme is concerned, the tree-based partitioning of the sample arrangements of an image into sample sets may be fixed by the syntax or may be signaled by corresponding subdivision information within the bit stream. The coding parameters for the sample sets may, as described above, be transmitted in a predefined order, which is given by the syntax.
According to the inheritance scheme, the decoder, or the extractor 102 of the decoder, is configured to derive the information on the coding parameters of an individual simply connected region or set of samples in a specific way. In particular, the coding parameters, or parts thereof, such as those parameters serving the purpose of prediction, are shared between blocks along the given tree-based partitioning scheme, with the sharing group along the tree structure being decided by the encoder or inserter 18, respectively. In a particular embodiment, sharing of the coding parameters for all descendant nodes of a given internal node of the partitioning tree is indicated by means of a specific sharing flag with a binary value. As an alternative approach, refinements of the coding parameters can be transmitted for each node in such a way that the accumulated refinements of parameters along the tree-based hierarchy of blocks can be applied to all sets of samples of the block at a given leaf node. In another embodiment, parts of the coding parameters that are transmitted for internal nodes along the tree-based hierarchy of blocks can be used for context-adaptive entropy coding and decoding of the coding parameter, or parts thereof, for the block at a given leaf node.
Figs. 12a and 12b illustrate the basic idea of inheritance for the specific case of using quadtree-based partitioning. However, as indicated above, other multitree subdivision schemes can be used as well. The tree structure is shown in Fig. 12a, while the corresponding spatial partitioning corresponding to the tree structure of Fig. 12a is shown in Fig. 12b. The partitioning shown there is similar to that shown with respect to Figs. 3a to 3c. Generally speaking, the inheritance scheme allows side information to be assigned to nodes located at different non-leaf layers within the tree structure. Depending on the assignment of side information to the nodes at the different layers of the tree, such as the internal nodes of the tree in Fig. 12a or its root node, different degrees of shared side information can be achieved within the hierarchy of the tree block shown in Fig. 12b. For example, if it is decided that all leaf nodes of layer 4, which, in the case of Fig. 12a, all have the same parent node, shall share the side information, this virtually means that the smallest blocks, indicated as 156a to 156d in Fig. 12b, share this side information, and it is no longer necessary to transmit the side information for all of these small blocks 156a to 156d in full, i.e., four times, although this is kept as an option for the encoder. However, it would also be possible to decide that a whole region of one hierarchy level (layer 2) in Fig. 12a, namely the upper-right quarter of the tree block 150, including the sub-blocks 154a, 154b and 154d as well as the even smaller sub-blocks 156a to 156d just mentioned, serves as a region within which coding parameters are shared. Thus, the area of shared side information is increased. The next level of increase would be the sum of all sub-blocks of layer 1, i.e., the sub-blocks 152a, 152c and 152d and the aforementioned smaller blocks. In other words, in that case, the whole tree block would have side information assigned thereto, with all sub-blocks of this tree block 150 sharing the side information.
In the following description of inheritance, the following notation is used to describe the embodiments:
a. Reconstructed samples of the current leaf node: r
b. Reconstructed samples of neighboring leaf nodes: r'
c. Predictor of the current leaf node: p
d. Residual of the current leaf node: Res
e. Reconstructed residual of the current leaf node: RecRes
f. Scaling and inverse transformation: SIT
g. Share flag: f
As a first example of inheritance, the intra-prediction signaling at internal nodes can be described. To be more precise, it is described how to signal intra-prediction modes at internal nodes of a tree-based block partitioning for the purpose of prediction. By traversing the tree from the root node to the leaf nodes, internal nodes (including the root node) can convey parts of side information that will be exploited by their corresponding descendant nodes. To be more specific, a share flag f is transmitted for internal nodes with the following meaning:
• If f has a value of 1 (true), all descendant nodes of the given internal node share the same intra-prediction mode. In addition to the share flag f with a value of 1, the internal node also signals the intra-prediction mode parameter to be used for all descendant nodes. Consequently, all subsequent descendant nodes carry neither any prediction mode information nor any share flags. For the reconstruction of all related leaf nodes, the decoder applies the intra-prediction mode from the corresponding internal node.
• If f has a value of 0 (false), the descendant nodes of the corresponding internal node do not share the same intra-prediction mode, and each descendant node that is an internal node carries a separate share flag.
Fig. 12c illustrates the intra-prediction signaling at internal nodes as described above. The internal node of layer 1 conveys the share flag and the side information, which is given by the intra-prediction mode information, and the descendant nodes do not carry any side information.
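The share-flag signaling just described can be illustrated with a small sketch. This is a simplified model, not the actual bit-stream syntax: the tree is given as nested lists (a leaf is "L"), the parsed bit stream is stood in by a flat token list, and each internal node first reads its share flag f, reading one shared mode for the whole subtree when f is 1.

```python
# Illustrative sketch of intra-prediction mode inheritance via share flags.
# Tree/token representations are assumptions made for this example.

def count_leaves(tree):
    return 1 if tree == "L" else sum(count_leaves(c) for c in tree)

def decode_modes(tree, tokens):
    """Return the intra-prediction mode of each leaf, in coding order."""
    if tree == "L":
        return [tokens.pop(0)]            # a leaf signals its own mode
    f = tokens.pop(0)                     # share flag at an internal node
    if f == 1:
        mode = tokens.pop(0)              # one mode for the whole subtree;
        return [mode] * count_leaves(tree)  # descendants carry nothing
    modes = []
    for child in tree:                    # f == 0: each subtree on its own
        modes += decode_modes(child, tokens)
    return modes

# Root: f = 0; its first child subtree shares one mode (f = 1, mode 'DC');
# the other three children are leaves with individually signaled modes.
tree = [["L", "L", "L", "L"], "L", "L", "L"]
tokens = [0, 1, "DC", "V", "H", "PL"]
print(decode_modes(tree, tokens))  # ['DC', 'DC', 'DC', 'DC', 'V', 'H', 'PL']
```

The four leaves under the sharing node cost one mode instead of four, which is the side-information saving the scheme targets.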
As a second example of inheritance, the inter-prediction refinement can be described. To be more precise, it is described how to signal side information of inter-prediction modes at internal nodes of a tree-based block partitioning for the purpose of refining motion parameters, such as, for example, those given by motion vectors. By traversing the tree from the root node to the leaf nodes, internal nodes (including the root node) can convey parts of side information that will be refined by their corresponding descendant nodes. To be more specific, a share flag f is transmitted for internal nodes with the following meaning:
• If f has a value of 1 (true), all descendant nodes of the given internal node share the same motion vector reference. In addition to the share flag with a value of 1, the internal node also signals the motion vector and the reference index. Consequently, all subsequent descendant nodes carry no further share flags, but may carry a refinement of this inherited motion vector reference. For the reconstruction of all related leaf nodes, the decoder adds the motion vector refinement at the given leaf node to the inherited motion vector reference belonging to its corresponding internal parent node that has a share flag with a value of 1. This means that the motion vector refinement at a given leaf node is the difference between the actual motion vector to be applied for the motion-compensated prediction at this leaf node and the motion vector reference of its corresponding internal parent node.
• If f has a value of 0 (false), the descendant nodes of the corresponding internal node do not necessarily share the same inter-prediction mode, no refinement of the motion parameters is performed at the descendant nodes using the motion parameters from the corresponding internal node, and each descendant node that is an internal node carries a separate share flag.
Fig. 12d illustrates the motion parameter refinement as described above. The internal node of layer 1 conveys the share flag and the side information. The descendant nodes that are leaf nodes carry only the motion parameter refinements, and, for example, the internal descendant node of layer 2 carries no side information.
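The refinement rule above reduces to one addition per leaf. A minimal numeric sketch, with made-up vector values, shows how each leaf's actual motion vector is recovered from the reference signaled once at the internal node plus the per-leaf refinement:

```python
# Illustrative sketch of motion vector refinement inheritance:
# leaf MV = inherited MV reference + refinement signaled at the leaf.
# All numeric values here are invented for the example.

def reconstruct_mv(inherited_mv, refinement):
    """Add the per-leaf refinement to the inherited motion vector reference."""
    return (inherited_mv[0] + refinement[0], inherited_mv[1] + refinement[1])

inherited = (12, -3)                        # signaled once at the internal node
refinements = [(0, 0), (1, -1), (-2, 2)]    # one small refinement per leaf node
mvs = [reconstruct_mv(inherited, d) for d in refinements]
print(mvs)  # [(12, -3), (13, -4), (10, -1)]
```

Because the refinements are small differences, they are cheaper to entropy-code than three full motion vectors would be.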
Reference is now made to Fig. 13. Fig. 13 shows a flow diagram illustrating the mode of operation of a decoder, such as the decoder of Fig. 2, reconstructing an array of samples of information representing a spatially sampled information signal, which is subdivided by multitree subdivision into leaf regions of different sizes, from a data stream. As described above, each leaf region has associated therewith a hierarchy level out of a sequence of hierarchy levels of the multitree subdivision. For example, all blocks shown in Fig. 12b are leaf regions. The leaf region 156c, for example, is associated with hierarchy layer 4 (or level 3). Each leaf region has coding parameters associated therewith. Examples of these coding parameters have been described above. The coding parameters are, for each leaf region, represented by a respective set of syntax elements. Each syntax element is of a respective syntax element type out of a set of syntax element types. Such a syntax element type is, for example, a prediction mode, a motion vector component, an indication of an intra-prediction mode or the like. According to Fig. 13, the decoder performs the following steps.
In step 550, the inheritance information is extracted from the data stream. In the case of Fig. 2, the extractor 102 is responsible for step 550. The inheritance information indicates whether inheritance is used or not for the current array of information samples. The following description will reveal that there are several possibilities for the inheritance information, such as, among others, the share flag f and the signaling of a multitree structure divided into a primary and a secondary part.
The array of information samples may already be a sub-part of an image, such as a tree block, for example the tree block 150 of Fig. 12b. Thus, the inheritance information indicates whether or not inheritance is used for the specific tree block 150. Such inheritance information may be inserted into the data stream for all tree blocks of the prediction subdivision, for example.
Further, the inheritance information indicates, if inheritance is indicated to be used, at least one inheritance region of the array of information samples, which is composed of a set of leaf regions and corresponds to a hierarchy level of the sequence of hierarchy levels of the multitree subdivision that is lower than each of the hierarchy levels with which the set of leaf regions is associated. In other words, the inheritance information indicates whether or not inheritance is to be used for the sample array, such as the tree block 150. If so, it denotes at least one inheritance region or sub-region of this tree block 150, within which the leaf regions share coding parameters. Thus, the inheritance region may not be a leaf region. In the example of Fig. 12b, this inheritance region may, for example, be the region formed by the sub-blocks 156a to 156d. Alternatively, the inheritance region may be larger and may additionally also comprise the sub-blocks 154a, b and d, and, even alternatively, the inheritance region may be the tree block 150 itself, with all the leaf blocks thereof sharing the coding parameters associated with that inheritance region.
It should be noted, however, that more than one inheritance region may be defined within one sample array or tree block 150, respectively. Imagine, for example, that the sub-block 152c in the lower-left corner was also divided into smaller blocks. In that case, the sub-block 152c could also form an inheritance region.
In step 552, the inheritance information is checked as to whether inheritance is used or not. If so, the process of Fig. 13 continues with step 554, where an inheritance subset including at least one syntax element of a predetermined syntax element type is extracted from the data stream per inheritance region. In the following step 556, this inheritance subset is then copied into, or used as a prediction for, a corresponding inheritance subset of syntax elements within the sets of syntax elements representing the coding parameters associated with the set of leaf regions of which the at least one inheritance region is composed. In other words, for each inheritance region indicated within the inheritance information, the data stream comprises an inheritance subset of syntax elements. In yet other words, the inheritance pertains to at least one certain syntax element type or syntax element category that is available for inheritance. For example, the prediction mode or inter-prediction mode or intra-prediction mode syntax element may be subject to inheritance. For example, the inheritance subset contained within the data stream for the inheritance region may comprise an inter-prediction mode syntax element. The inheritance subset may also comprise further syntax elements of syntax element types which depend on the value of the aforementioned fixed syntax element type associated with the inheritance scheme. For example, if the inter-prediction mode is a fixed component of the inheritance subset, the syntax elements defining the motion compensation, such as the motion vector components, may or may not be included in the inheritance subset by syntax. Imagine, for example, that the upper-right quarter of the tree block 150, i.e., sub-block 152b, was the inheritance region; then either the inter-prediction mode alone could be indicated for this inheritance region, or the inter-prediction mode together with motion vectors and motion vector indices.
All syntax elements contained in the inheritance subset are copied into, or used as a prediction for, the corresponding coding parameters of the leaf blocks within that inheritance region, i.e., the leaf blocks 154a, b, d and 156a to 156d. In case prediction is used, residuals are transmitted for the individual leaf blocks.
One possibility for transmitting the inheritance information for the tree block 150 is the aforementioned transmission of a share flag f. The extraction of the inheritance information in step 550 could, in this case, comprise the following. In particular, the decoder could be configured to extract and check, for non-leaf regions corresponding to any of an inheritance set of at least one hierarchy level of the multitree subdivision, using a hierarchy level order from lower to higher hierarchy level, the share flag f from the data stream, as to whether the respective inheritance flag or share flag prescribes inheritance or not. For example, the inheritance set of hierarchy levels may be formed by hierarchy layers 1 to 3 in Fig. 12a. Thus, for any of the nodes of the subtree structure that is not a leaf node and lies within any of layers 1 to 3, a share flag may be associated therewith within the data stream. The decoder extracts these share flags in an order from layer 1 to layer 3, such as in a depth-first or breadth-first traversal order. As soon as one of the share flags equals 1, the decoder knows that the leaf blocks contained in the corresponding inheritance region share the inheritance subset subsequently extracted in step 554. For the descendant nodes of the current node, a checking of inheritance flags is no longer necessary. In other words, the inheritance flags for these descendant nodes are not transmitted within the data stream, since it is clear that the area of these nodes already belongs to the inheritance region within which the inheritance subset of syntax elements is shared.
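The stop-descending-on-1 behavior can be sketched as follows. This is an assumed model of the parsing order (here depth-first), not the actual decoder: one flag is consumed per visited internal node, and as soon as a flag equals 1 the whole subtree is recorded as one inheritance region and no flags are read for its descendants.

```python
# Illustrative sketch of share-flag extraction with early stop.
# Tree representation ('L' = leaf, list = internal node) is an assumption.

def extract_share_flags(tree, bitstream, out=None):
    """Consume one share flag per visited internal node (depth-first);
    return the subtrees that were marked as inheritance regions."""
    if out is None:
        out = []
    if tree == "L":
        return out                # leaves never carry share flags
    flag = bitstream.pop(0)
    if flag == 1:
        out.append(tree)          # whole subtree is one inheritance region
        return out                # its descendants carry no flags at all
    for child in tree:            # flag == 0: keep descending
        extract_share_flags(child, bitstream, out)
    return out

tree = [["L", "L", "L", "L"], "L", "L", ["L", "L", "L", "L"]]
flags = [0, 1, 0]   # root: 0; first subtree: 1; last subtree: 0
regions = extract_share_flags(tree, flags)
print(len(regions))  # 1
```

Only three flags are consumed for this tree: once the first subtree's flag is 1, its four leaves contribute no flags of their own.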
The share flags f may be interleaved with the aforementioned bits signaling the quadtree subdivision. For example, an interleaved bit sequence including both the subdivision flags and the share flags may be:
10001101(0000)000, which is the same subdivision information as illustrated in Fig. 6a, with two share flags interspersed, which are highlighted by underlining in order to indicate that, in Fig. 3c, all the sub-blocks in the lower-left quarter of the tree block 150 share their coding parameters.
Another way of defining the inheritance information indicating the inheritance region would be the use of two subdivisions defined in a subordinate manner to each other, as explained above with respect to the prediction and residual subdivision, respectively. Generally speaking, the leaf blocks of the primary subdivision could form the inheritance regions defining the regions within which the inheritance subsets of syntax elements are shared, while the subordinate subdivision defines the blocks within these inheritance regions for which the inheritance subset of syntax elements is copied or used as a prediction.
Consider, for example, the residual tree as an extension of the prediction tree. Further, consider the case where prediction blocks can be further divided into smaller blocks for the purpose of residual coding. For each prediction block that corresponds to a leaf node of the prediction-related quadtree, the corresponding subdivision for residual coding is determined by one or more subordinate quadtree(s).
In this case, rather than using any prediction signaling at internal nodes, we consider the residual tree to be interpreted in such a way that it also specifies a refinement of the prediction tree, in the sense of using a constant prediction mode (signaled by the corresponding leaf node of the prediction-related tree), but with refined reference samples. The following example illustrates this case.
For example, Figs. 14a and 14b show a quadtree partitioning for intra prediction, with the neighboring reference samples highlighted for one specific leaf node of the primary subdivision, while Fig. 14b shows the residual quadtree subdivision for the same prediction leaf node with refined reference samples. All sub-blocks shown in Fig. 14b share the same intra-prediction parameters contained within the data stream for the respective leaf block highlighted in Fig. 14a. Thus, Fig. 14a shows an example of the conventional quadtree partitioning for intra prediction, where the reference samples for one specific leaf node are depicted. In our preferred embodiment, however, a separate intra-prediction signal is calculated for each leaf node of the residual tree by using neighboring samples of already reconstructed leaf nodes of the residual tree, e.g., as indicated by the grey-shaded stripes in Fig. 14b. Then, the reconstructed signal of a given residual leaf node is obtained in the usual way by adding the quantized residual signal to this prediction signal. This reconstructed signal is then used as a reference signal for the following prediction process. Note that the decoding order for the prediction is the same as the residual decoding order.
In the decoding process, as shown in Figure 15, for each residual leaf node, the prediction signal p is calculated according to the actual intra-prediction mode (as indicated by the prediction-related quadtree leaf node) by using the reference samples r'.
After the SIT process,

RecRes = SIT(Res)

the reconstructed signal r is calculated and stored for the next prediction calculation process:

r = RecRes + p

The decoding order for the prediction is the same as the residual decoding order, which is illustrated in Figure 16.
Each residual leaf node is decoded as described in the previous paragraph. The reconstructed signal r is stored in a buffer, as shown in Figure 16. From this buffer, the reference samples r' are taken for the next prediction and decoding process.
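The per-leaf reconstruction loop above can be sketched numerically. This is a deliberately minimal model: samples are single numbers, the scaling and inverse transformation SIT is stood in by a caller-supplied function (identity by default), and the buffer simply collects the reconstructed values r = SIT(Res) + p in decoding order.

```python
# Minimal numeric sketch of RecRes = SIT(Res), r = RecRes + p, leaf by leaf.
# Scalar samples and the identity stand-in for SIT are simplifications.

def decode_residual_leaves(predictions, residuals, sit=lambda x: x):
    """Reconstruct each residual leaf and keep the results in a buffer,
    which in the real scheme supplies reference samples r' for later leaves."""
    buffer = []
    for p, res in zip(predictions, residuals):
        rec_res = sit(res)    # scaling and inverse transformation
        r = rec_res + p       # reconstructed signal of this leaf
        buffer.append(r)
    return buffer

print(decode_residual_leaves([10, 20, 30], [1, -2, 3]))  # [11, 18, 33]
```

In the actual scheme, the prediction p for a later leaf would itself be derived from earlier entries of this buffer, which is why the prediction and residual decoding orders must coincide.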
Having described specific embodiments with respect to Figs. 1 to 16, which combine distinct subsets of the aspects outlined above, further embodiments of the present application are described which focus on certain aspects already described above, but which represent generalizations of some of the embodiments described above.
In particular, the embodiments described above with respect to the framework of Figs. 1 and 2 mainly combined many aspects of the present application, which would also be advantageous when used in other applications or other coding fields. As frequently mentioned in the discussion above, the multitree subdivision, for example, may be used without the merging and/or without the inter-plane adoption/prediction and/or without the inheritance. For example, the transmission of the maximum block size, the use of the depth-first traversal order, the context adaptation depending on the hierarchy level of the respective subdivision flag, and the transmission of the maximum hierarchy level within the bit stream in order to save side information bit rate are all aspects that are advantageous independently of each other. This is also true when considering the merging scheme. Merging is advantageous independently of the exact way an image is subdivided into simply connected regions, and is advantageous independently of the existence of more than one sample array or the use of inter-plane adoption/prediction and/or inheritance. The same applies to the advantages involved with inter-plane adoption/prediction and inheritance.
Accordingly, the embodiments described below generalize the above-mentioned embodiments with respect to aspects relating to merging. As the following embodiments represent generalizations of the embodiments described above, many of the details described above may be regarded as combinable with the embodiments described below.
Fig. 17 shows a decoder according to an embodiment of the present application. The decoder of Fig. 17 comprises an extractor 600 and a reconstructor 602. The extractor 600 is configured to extract, for each of a plurality of simply connected regions into which an array of information samples representing a spatially sampled information signal is subdivided, payload data from a data stream 604. As described above, the simply connected regions into which the array of information samples is subdivided may stem from a multitree subdivision and may be of square or rectangular shape. Further, the embodiments specifically described for subdividing a sample array are merely specific embodiments, and other subdivisions may be used as well. Some possibilities are shown in Figs. 18a-c. Fig. 18a, for example, shows the subdivision of a sample array 606 into a regular two-dimensional arrangement of non-overlapping, abutting tree blocks 608, some of which are subdivided, in accordance with a multitree structure 610, into sub-blocks of different sizes. As mentioned above, although a quadtree subdivision is illustrated in Fig. 18a, a partitioning of each parent node into any other number of descendant nodes is also possible. Fig. 18b shows an embodiment according to which a sample array 606 is subdivided into sub-blocks of different sizes by applying a multitree subdivision directly onto the whole pixel array 606. That is, the whole pixel array 606 is treated as the tree block. Fig. 18c shows another embodiment. According to this embodiment, the sample array is structured into a regular two-dimensional arrangement of square- or rectangular-shaped macroblocks 612, which abut one another, and each macroblock 612 is individually associated with partitioning information according to which the macroblock 612 is left unpartitioned or is partitioned into a regular two-dimensional arrangement of blocks of a size indicated by the partitioning information. As can be seen, all the subdivisions of Figs. 18a-18c lead to a subdivision of the sample array 606 into simply connected regions, which are, in the exemplary embodiments of Figs. 18a and 18b, non-overlapping.
However, several alternatives are possible. For example, the blocks may overlap one another. The overlapping may, however, be restricted in such a way that each block has a portion not overlapped by any neighboring block, or in such a way that each sample of the blocks is overlapped by at most one block among the blocks arranged in juxtaposition to the current block along a predetermined direction. That is, the left and right neighboring blocks may overlap the current block so as to fully cover the current block, but may not overlap each other, and the same applies to neighbors in the vertical and diagonal directions.
As described above with respect to Figs. 1 to 16, the array of information samples does not necessarily represent a video picture or a still picture. The sample array 606 could also represent a depth map or a transparency map of some scene.
The payload data associated with each of the plurality of simply connected regions may, as already discussed above, comprise residual data in the spatial domain or in a transform domain, such as transform coefficients and a significance map identifying the positions of significant transform coefficients within a transform block corresponding to a residual block. Generally speaking, the payload data extracted by the extractor 600 for each simply connected region from the data stream 604 is data that describes its associated simply connected region spatially, either in the spatial domain or in a spectral domain, and either directly or as a residual relative to some prediction, for example.
The reconstructor 602 is configured to reconstruct the array of information samples from the payload data for the simply connected regions of the array of information samples by processing, for each simply connected region, the payload data for the respective simply connected region in a manner prescribed by coding parameters associated with the respective simply connected region. Similarly to the discussion above, the coding parameters may be prediction parameters, and, accordingly, the simply connected regions shown in Figs. 18a-18b may correspond to the aforementioned prediction blocks, i.e., blocks in units of which the data stream 604 conveys prediction details for predicting the individual simply connected regions. However, the coding parameters are not restricted to prediction parameters. The coding parameters could indicate a transform used to transform the payload data, or could define a filter to be used in reconstructing the individual simply connected regions when reconstructing the array of information samples.
The extractor 600 is configured to identify, for a predetermined simply connected region, simply connected regions within the plurality of simply connected regions that have a predetermined relative location relationship to the predetermined simply connected region. Details regarding this step have been described above with respect to step 450. That is, besides the predetermined relative location relationship, the identification may depend on a subset of the coding parameters associated with the predetermined simply connected region. After the identification, the extractor 600 extracts a merge indicator for the predetermined simply connected region from the data stream 604 if the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is greater than zero. This corresponds to the description of steps 452 and 454 above. If the merge indicator suggests a merge processing of the predetermined block, the extractor 600 is configured to check whether the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is one, or whether the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is greater than one but with coding parameters that are identical to one another. If either of the two alternatives holds, the extractor adopts those coding parameters, or uses them for a prediction of the coding parameters of the predetermined simply connected region, or of a remaining subset thereof, as described above with respect to steps 458-468. As described above with reference to fig. 10, a further indicator may be extracted by the extractor 600 from the data stream 604 in case the aforementioned checks reveal that the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is greater than one and that these regions have coding parameters differing from one another.
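The candidate check and merge-indicator handling just described can be sketched as follows; this is a minimal illustrative model, not the patented decoder, and all names (`Region`, `candidates`, `read_flag`, `read_index`) are hypothetical:

```python
class Region:
    def __init__(self, coding_params=None):
        self.coding_params = coding_params or {}

def decode_merge_info(region, candidates, read_flag, read_index):
    # `candidates`: simply connected regions having the predetermined
    # relative location relationship to `region` (e.g. top/left neighbors).
    # `read_flag` / `read_index`: callables that consume the data stream.
    if not candidates:
        return None                        # no merge indicator transmitted
    if not read_flag():                    # merge indicator: no merging
        return None
    distinct = {tuple(sorted(c.coding_params.items())) for c in candidates}
    if len(candidates) == 1 or len(distinct) == 1:
        chosen = candidates[0]             # one candidate, or identical
    else:                                  # parameters: adopt directly
        chosen = candidates[read_index()]  # further indicator selects one
    region.coding_params = dict(chosen.coding_params)
    return chosen

a = Region({"mv": (1, 0)})
b = Region({"mv": (1, 0)})
current = Region()
chosen = decode_merge_info(current, [a, b],
                           read_flag=lambda: True, read_index=lambda: 1)
```

Note that with identical candidate parameters no index is read, which is exactly the side-information saving described in the text.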
Through the use of these subsequent checks, the transmission of a further indicator identifying one candidate, or a subset of the candidate simply connected regions, can be suppressed, thereby reducing the side information overhead.
Fig. 19 shows the general structure of an encoder for generating a data stream decodable by the decoder of fig. 17. The encoder of fig. 19 comprises a data generator 650 and an inserter 652. The data generator 650 is configured to encode the array of information samples into payload data for each of a plurality of simply connected regions into which the array of information samples is subdivided, along with coding parameters associated with the respective simply connected regions which indicate how the payload data for the respective simply connected region is to be reconstructed. The inserter 652 performs the identification and the checks just as the extractor 600 of the decoder of fig. 17 does, but performs an insertion of the merge indicator instead of its extraction, and suppresses an insertion of the coding parameters into the data stream, or replaces an insertion of the coding parameters into the data stream in their entirety with an insertion of a respective prediction residual, corresponding to the adoption/prediction described above with reference to fig. 17 and fig. 10, respectively.
In addition, it should be noted that the encoder structure of fig. 19 is somewhat schematic and that, in fact, the determination of the payload data, the coding parameters and the merge indicator may be an iterative process. For example, if the coding parameters of neighboring simply connected regions are similar but not identical to one another, an iterative process may determine that suppressing the small differences between these coding parameters is preferable to signaling those differences to the decoder, considering that the merge indicator allows suppressing the coding parameters of one of the simply connected regions completely, or replacing the transmission of these coding parameters entirely with the transmission of a residual only.
Fig. 20 shows a further embodiment of a decoder. The decoder of fig. 20 comprises a subdivider 700, a merger 702 and a reconstructor 704. The subdivider is configured to spatially subdivide, depending on a first subset of syntax elements contained in the data stream, an array of samples representing a spatial sampling of a two-dimensional information signal into a plurality of non-overlapping simply connected regions of different sizes by recursive multi-partitioning. Thus, the multi-partitioning may correspond to the embodiments described above with respect to Figs. 1-16, Fig. 18a or Fig. 18b, respectively. The syntax elements contained in the data stream for indicating the subdivision may be defined as indicated above with respect to Figs. 6a and 6b, or otherwise.
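A recursive multi-partitioning of the kind performed by the subdivider 700 can be illustrated with a quadtree driven by per-node split flags; this is a hedged sketch, where the bit source, the quadtree arity of four, and the concrete numbers are chosen for illustration only:

```python
def subdivide(x, y, size, split_flags, max_level, level=0):
    # Consume one split flag per node; at max_level no flag is read and
    # the region becomes a leaf (a smallest simply connected region).
    if level < max_level and next(split_flags):
        half = size // 2
        leaves = []
        for dy in (0, half):          # four quadrants, depth-first order
            for dx in (0, half):
                leaves += subdivide(x + dx, y + dy, half,
                                    split_flags, max_level, level + 1)
        return leaves
    return [(x, y, size)]             # leaf region: (top-left x, y, size)

# Split a 64x64 tree-root region once, then split only its first quadrant.
flags = iter([True, True, False, False, False])
leaves = subdivide(0, 0, 64, flags, max_level=2)
```

The leaves tile the tree-root region without gaps or overlaps, matching the non-overlapping simply connected regions of different sizes referred to above.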
The merger 702 is configured to combine, depending on a second subset of syntax elements of the data stream, disjoint from the first subset, spatially neighboring simply connected regions of the plurality of simply connected regions in order to obtain an intermediate subdivision of the sample array into disjoint sets of simply connected regions, the union of which is the plurality of simply connected regions. In other words, the merger 702 combines the simply connected regions and uniquely assigns them to groups of merged simply connected regions. The just-mentioned second subset of syntax elements indicating the merging information may be defined as shown above with respect to fig. 19 and fig. 10, respectively, or in some other way. The ability of the encoder to indicate the subdivision by use of a subset of syntax elements disjoint from the subset by which merging is indicated, however, increases the encoder's freedom to adapt the subdivision of the sample array to the actual content of the sample array, so that the coding efficiency can be increased. The reconstructor 704 is configured to reconstruct the sample array from the data stream using the intermediate subdivision. As indicated above, the reconstructor may exploit the intermediate subdivision by adopting/predicting the coding parameters of a merging partner for a current simply connected region. Alternatively, the reconstructor 704 may even apply a transform or a prediction process to the combined area of a group resulting from the merging of simply connected regions.
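The grouping into disjoint sets whose union is the full plurality of regions is precisely a disjoint-set (union-find) structure; a minimal sketch follows, where representing the second syntax-element subset as (region, partner) pairs is an assumption made purely for illustration:

```python
def merge_groups(regions, merge_pairs):
    # Start with every simply connected region in its own group, then
    # union the pairs indicated by the (hypothetical) merge syntax.
    parent = {r: r for r in regions}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]   # path halving
            r = parent[r]
        return r

    for a, b in merge_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for r in regions:
        groups.setdefault(find(r), set()).add(r)
    return list(groups.values())  # disjoint sets; their union = regions

groups = merge_groups(["A", "B", "C", "D"], [("A", "B"), ("C", "B")])
```

Each resulting set is one merged group, so a transform or prediction process can then be applied to the combined area of the group as a whole.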
Fig. 21 shows a possible encoder for generating a data stream decodable by the decoder of fig. 20. The encoder comprises a subdivision/merging stage 750 and a data stream generator 752. The subdivision/merging stage is configured to determine an intermediate subdivision of an array of information samples representing a spatial sampling of a two-dimensional information signal into disjoint sets of simply connected regions, the union of which is the plurality of simply connected regions, defining this intermediate subdivision by means of a first subset of syntax elements according to which the array of information samples is subdivided into a plurality of non-overlapping simply connected regions of different sizes by recursive multi-partitioning, and a second subset of syntax elements, disjoint from the first subset, according to which spatially neighboring simply connected regions of the plurality of simply connected regions are combined to obtain the intermediate subdivision. The data stream generator 752 uses the intermediate subdivision in order to encode the array of information samples into the data stream. The subdivision/merging stage 750 also inserts the first and second subsets into the data stream.
Again, as in the case of fig. 19, the process of determining the first and second subsets and the syntax elements generated by the data stream generator 752 may be an iteratively operating process. For example, the subdivision/merging stage 750 may preliminarily determine an optimal subdivision, whereupon the data stream generator 752 determines a corresponding set of syntax elements optimal for encoding the sample array using that subdivision, with the subdivision/merging stage then adjusting the syntax elements describing the merging so that the side information overhead is reduced. However, the encoding process need not stop here. Rather, the subdivision/merging stage 750 together with the data stream generator 752 may cooperate in trying to deviate from the subdivision settings and syntax elements previously determined as optimal, in order to determine whether a better rate/distortion ratio is achieved by exploiting the positive properties of the merging.
As described above, the embodiments described with respect to Figs. 17-21 represent generalizations of the embodiments described previously with respect to Figs. 1-16, and accordingly it is possible to associate the elements of Figs. 1-16 with the elements shown in Figs. 17-21. For example, the extractor 102, along with the subdivider 104a and the merger 104b, assumes responsibility for the tasks performed by the extractor 600 of fig. 17. The subdivider is responsible for the subdivision and for managing the neighborhood relationships among the individual simply connected regions. The merger 104b, in turn, manages the merging of simply connected regions into groups and locates the correct coding parameters to be copied, or to be used as a prediction, for the current simply connected regions in case a merge event is indicated by the currently decoded merging information. The extractor 102 assumes responsibility for extracting the actual data from the data stream using the correct context in case entropy decoding is used for the data extraction. The remaining elements of fig. 2 are an example of the reconstructor 602. Naturally, the reconstructor 602 may differ from the embodiment shown in fig. 2. For example, the reconstructor 602 may use no motion-compensated prediction and/or no intra prediction. Rather, other possibilities may apply as well.
Furthermore, as mentioned above, the simply connected regions mentioned in connection with the description of fig. 17 may correspond, as already indicated above, to any of the above-mentioned prediction blocks or to any of the other subdivisions mentioned above, such as the residual subdivision or the filter subdivision.
When comparing the encoder of fig. 19 with the example of fig. 1, the data generator 650 would encompass all the elements of fig. 1 except the data stream inserter 18, while the latter would correspond to the inserter 652 of fig. 19. Again, the data generator 650 may use a coding approach other than the hybrid coding approach shown in fig. 1.
When comparing the decoder of fig. 20 with the example shown in fig. 2, the subdivider 104a and the merger 104b would correspond to the subdivider 700 and the merger 702 of fig. 20, respectively, while elements 106 and 114 would correspond to the reconstructor 704. The extractor 102 would, functionally, be shared among all the elements shown in fig. 20.
As far as the encoder of fig. 21 is concerned, the subdivision/merging stage 750 would correspond to the subdivider 28 and the merger 30, while the data stream generator 752 would encompass all the other elements shown in fig. 1.
Although some aspects have been described in the context of an apparatus, it is evident that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or to a feature of a method step. Similarly, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be performed by (or using) a hardware apparatus, such as, for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be performed by such an apparatus.
The inventive encoded / compressed signals can be stored on a digital storage medium or can be transmitted on a transmission medium, such as a wireless transmission medium or a wired transmission medium, such as the Internet.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray disc, a CD, a ROM, a PROM, an EPROM, an EEPROM or a flash memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer-readable.
Some embodiments according to the invention comprise a data carrier with electronically readable control signals, which are capable of cooperating with a programmable computer system, so that one of the methods described here is performed.
Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.
Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine-readable carrier.
In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising the computer program for performing one of the methods described herein.
Another embodiment of the method of the invention is, therefore, a data stream or a sequence of signals representing the computer program for carrying out one of the methods described herein. The data stream or signal sequence can, for example, be configured to be transferred over a data communication link, for example, over the Internet.
An embodiment also comprises a processing means, for example, a computer or a programmable logic device, configured for or adapted to perform one of the methods described herein.
Another embodiment comprises a computer having a computer program installed on it to perform one of the methods described herein.
In some embodiments, a programmable logic device (for example, a field-programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field-programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.
The above-described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and details described herein will be apparent to others skilled in the art. It is the intent, therefore, that the invention be limited only by the scope of the appended patent claims and not by the specific details presented by way of the description and explanation of the embodiments herein.
Claims (30)
[1]
1. Decoder, characterized by comprising:
an extractor configured to extract, for each of a plurality of simply connected regions into which an array of samples representing a spatially sampled information signal is subdivided, payload data from a data stream, and a reconstructor configured to reconstruct the array of information samples from the payload data for the simply connected regions of the array of information samples by processing, for each simply connected region, the payload data for the respective simply connected region in a manner prescribed by coding parameters associated with the respective simply connected region, wherein the extractor is further configured to identify, for a predetermined simply connected region, simply connected regions within the plurality of simply connected regions that have a predetermined relative location relationship to the predetermined simply connected region, if the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is greater than zero, extract a merge indicator for the predetermined simply connected region from the data stream, and, if the merge indicator suggests a merge processing of the predetermined block, if the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is one, adopt the coding parameters of that simply connected region as the coding parameters for the predetermined simply connected region, or predict the coding parameters for the predetermined simply connected region from the coding parameters of the simply connected region having the predetermined relative location relationship to the predetermined simply connected region and extract a prediction residual for the predetermined simply connected region from the data stream.
[2]
2. Decoder according to claim 1, characterized in that the extractor is further configured such that, if the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is greater than one, it checks whether the coding parameters associated with those simply connected regions are identical to one another and, if so, adopts the coding parameters of the simply connected regions as the coding parameters for the predetermined simply connected region, or predicts the coding parameters for the predetermined simply connected region from the coding parameters of the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region, with extraction of a prediction residual from the data stream.
[3]
3. Decoder according to claim 1 or 2, characterized in that it further comprises a subdivider configured to spatially subdivide, using a multi-tree subdivision, the array of information samples into the simply connected regions by recursive multi-partitioning such that the simply connected regions are of different sizes, wherein the decoder is configured to process the simply connected regions in a depth-first traversal order.
[4]
4. Decoder according to any one of claims 1 to 3, characterized in that the subdivider is configured to spatially divide the array of information samples into tree-root regions of a maximum size and to subdivide, in accordance with multi-tree subdivision information, at least a subset of the tree-root regions into smaller simply connected regions of different sizes by recursive multi-partitioning of the subset of tree-root regions, the tree-root regions not partitioned in accordance with the multi-tree subdivision information and the smaller simply connected regions forming the plurality of simply connected regions.
[5]
5. Decoder according to any one of claims 1 to 4, characterized in that the decoder is configured such that, in identifying the simply connected regions having the predetermined relative location relationship to the predetermined simply connected region, those simply connected regions are identified within which, if any, information samples lie that are adjacent to an information sample at a border of the predetermined simply connected region.
[6]
6. Decoder according to any one of claims 1 to 5, characterized in that the decoder is configured to extract, before the identification, a first subset of the coding parameters for the predetermined simply connected region from the data stream, to identify, in the identification of the simply connected regions having the predetermined relative location relationship to the predetermined simply connected region, those simply connected regions that have the predetermined relative location relationship to the predetermined simply connected region and, at the same time, have associated coding parameters fulfilling a predetermined relationship with the first subset of the coding parameters for the predetermined simply connected region, and to perform the adoption or the prediction with extraction of the prediction residual, on a second subset of the coding parameters for the predetermined simply connected region, disjoint from the first subset of coding parameters, conditionally, depending on the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region and, at the same time, having associated coding parameters fulfilling the predetermined relationship with the first subset of the coding parameters for the predetermined simply connected region.
[7]
7. Decoder according to any one of claims 1 to 6, characterized in that the decoder is configured to extract, if the coding parameters associated with the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region are unequal to one another, a neighbor reference identifier from the data stream, identifying a proper subset of the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region, and to adopt the coding parameters of this proper subset as the coding parameters for the predetermined simply connected region, or to predict the coding parameters of the predetermined simply connected region from the proper subset, with extraction of a prediction residual from the data stream.
[8]
8. Decoder, characterized by comprising:
an extractor configured to extract, for each of a plurality of simply connected regions into which an array of samples representing a spatially sampled information signal is subdivided, payload data from a data stream, and a reconstructor configured to reconstruct the array of information samples from the payload data for the simply connected regions of the array of information samples by processing, for each simply connected region, the payload data for the respective simply connected region in a manner prescribed by coding parameters associated with the respective simply connected region, wherein the extractor is further configured to extract a first subset of the coding parameters for a predetermined simply connected region from the data stream, identify, for the predetermined simply connected region, simply connected regions within the plurality of simply connected regions that have a predetermined relative location relationship to the predetermined simply connected region, if the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is greater than zero, extract a merge indicator for the predetermined simply connected region from the data stream, and, if the merge indicator suggests a merge processing of the predetermined block, compute, for each of the plurality of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region, a distance, according to a predetermined distance measure, between the first subset of the coding parameters of the predetermined simply connected region and a corresponding subset of the coding parameters of the respective simply connected region having the predetermined relative location relationship to the predetermined simply connected region, and adopt the corresponding subset of the coding parameters of the simply connected region of minimum distance as a second subset of the coding parameters for the predetermined simply connected region, disjoint from the first subset, or predict the second subset of the coding parameters for the predetermined simply connected region from the corresponding subset of the coding parameters of the simply connected region of minimum distance, with extraction of a prediction residual for the predetermined simply connected region from the data stream.
[9]
9. Decoder for decoding a data stream into which a two-dimensional information signal is encoded, characterized by comprising:
a subdivider configured to spatially subdivide, depending on a first subset of syntax elements contained in the data stream, an array of information samples representing a spatial sampling of the two-dimensional information signal into a plurality of simply connected regions of different sizes by recursive multi-partitioning;
a merger configured to combine, depending on a second subset of syntax elements within the data stream, disjoint from the first subset, spatially neighboring simply connected regions of the plurality of simply connected regions to obtain an intermediate subdivision of the array of information samples into disjoint sets of simply connected regions, the union of which is the plurality of simply connected regions, and a reconstructor configured to reconstruct the array of information samples from the data stream using the intermediate subdivision.
[10]
10. Decoder according to claim 9, characterized in that the subdivider is configured to spatially divide the array of information samples into tree-root regions of a maximum size and to subdivide, in accordance with the first subset of syntax elements, at least a subset of the tree-root regions into smaller simply connected regions of different sizes by recursive multi-partitioning of the subset of tree-root regions, the tree-root regions not partitioned in accordance with the multi-tree subdivision information and the smaller simply connected regions together forming the plurality of simply connected regions.
[11]
11. Decoder according to claim 10, characterized in that the subdivider is configured to divide the array of information samples into the tree-root regions such that the tree-root regions are rectangular blocks of a size determined by the maximum region size, regularly arranged so as to gaplessly cover the array of information samples.
[12]
12. Decoder according to claim 10 or 11, characterized in that the subdivider is configured to, in subdividing the subset of tree-root regions, check, for each tree-root region, the first subset of syntax elements as to whether the respective tree-root region is to be partitioned, and, if the respective tree-root region is to be partitioned, partition the respective tree-root region into regions of a first hierarchy level in accordance with a partitioning rule associated with the first hierarchy level, and recursively repeat the checking and partitioning for the regions of the first hierarchy level, in order to obtain regions of higher-order hierarchy levels using the associated partitioning rules, stopping the recursive repetition when no further partitioning is to be performed according to the first subset of syntax elements, or when a maximum hierarchy level is reached, wherein the regions of the subset of tree-root regions no longer partitioned according to the first subset of syntax elements represent the smaller simply connected regions and the leaf regions of the multi-tree subdivision, respectively.
[13]
13. Decoder according to claim 12, characterized in that the maximum hierarchy level is indicated by the first subset of syntax elements.
[14]
14. Decoder according to claim 12 or 13, characterized in that the subdivider is configured to, in accordance with the partitioning rules associated with the first and higher-order hierarchy levels, perform a partitioning into sub-regions of equal size, the number of sub-regions thus obtained being common to all hierarchy levels.
[15]
15. Decoder according to any one of claims 12 to 14, characterized in that the reconstructor is configured to extract the syntax elements of the first subset associated with the leaf regions of the subset of tree-root regions in a depth-first traversal order from the data stream.
[16]
16. Decoder according to any one of claims 12 to 15, characterized in that the first subset of syntax elements comprises a partition indication flag associated with each tree-root region and each region of the first and higher-order hierarchy levels not belonging to the maximum hierarchy level, respectively, the partition indication flags indicating whether the associated tree-root region or region of the first or higher-order hierarchy level, respectively, is partitioned.
[17]
17. Decoder according to any one of claims 9 to 16, characterized in that the reconstructor is configured to perform one or more of the following measures at a granularity corresponding to the intermediate subdivision: deciding which prediction mode among at least an intra and an inter prediction mode to use; a transform from the spectral to the spatial domain; performing, and/or setting parameters for, an inter prediction; and performing, and/or setting parameters for, an intra prediction.
[18]
18. Decoder according to any one of claims 9 to 16, characterized in that the reconstructor is configured to perform one or more of the following measures at a granularity depending on the intermediate subdivision: deciding which prediction mode among at least an intra and an inter prediction mode to use; a transform from the spectral to the spatial domain; performing, and/or setting parameters for, an inter prediction; performing, and/or setting parameters for, an intra prediction; extracting and setting coding parameters; and extracting coding parameter predictions along with coding parameter residuals and setting the coding parameters by combining the coding parameter predictions and the coding parameter residuals, at a finer granularity determined by the subdivision into the smaller simply connected regions.
[19]
19. Decoding method, characterized by comprising:
extracting, for each of a plurality of simply connected regions into which an array of samples representing a spatially sampled information signal is subdivided, payload data from a data stream, and reconstructing the array of information samples from the payload data for the simply connected regions of the array of information samples by processing, for each simply connected region, the payload data for the respective simply connected region in a manner prescribed by coding parameters associated with the respective simply connected region, wherein the extraction comprises identifying, for a predetermined simply connected region, simply connected regions within the plurality of simply connected regions that have a predetermined relative location relationship to the predetermined simply connected region, if the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is greater than zero, extracting a merge indicator for the predetermined simply connected region from the data stream, and, if the merge indicator suggests a merge processing of the predetermined block, if the number of simply connected regions having the predetermined relative location relationship to the predetermined simply connected region is one, adopting the coding parameters of that simply connected region as the coding parameters for the predetermined simply connected region, or predicting the coding parameters for the predetermined simply connected region from the coding parameters of the simply connected region having the predetermined relative location relationship to the predetermined simply connected region and extracting a prediction residual for the predetermined simply connected region from the data stream.
[20]
20. Method for decoding a data stream into which a two-dimensional information signal is encoded, characterized by comprising:
spatial sub-division, depending on a first subset of syntax elements contained in the data stream, a set of samples of information that represents a spatial sampling of the two-dimensional information signal in a plurality of regions simply linked in different sizes by recursive multi-partitioning ;
combining, depending on a second subset of syntax elements within the data stream, disjoint from the first subset, spatially neighboring simply connected regions of the plurality of simply connected regions to obtain an intermediate subdivision of the set of information samples into disjoint sets of simply connected regions, the union of which is the plurality of simply connected regions; and reconstructing the set of information samples from the data stream using the intermediate subdivision.
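The two-stage scheme of claim 20 — recursive multi-partitioning steered by a first subset of syntax elements, then a merging of neighboring regions steered by a second subset — can be sketched as follows, assuming a quadtree as the multi-partitioning and representing the second subset as explicit merge pairs (both are illustrative simplifications):

```python
def subdivide(x, y, size, split_flags, min_size):
    """Recursive quad-partitioning: consume one split flag per block
    above the minimum size; a set flag splits the block into four
    equally sized sub-blocks (the first syntax-element subset)."""
    if size > min_size and split_flags.pop(0):
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks += subdivide(x + dx, y + dy, half, split_flags, min_size)
        return blocks
    return [(x, y, size)]

def merge_into_groups(blocks, merge_pairs):
    """Union-find grouping: each merge pair (second syntax-element subset)
    joins two neighboring blocks into one set; the result is the
    intermediate subdivision — disjoint sets whose union is all blocks."""
    parent = list(range(len(blocks)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for a, b in merge_pairs:
        parent[find(a)] = find(b)
    groups = {}
    for i in range(len(blocks)):
        groups.setdefault(find(i), []).append(blocks[i])
    return list(groups.values())
```

For a 16×16 root with one split flag set, four 8×8 leaves result; merging leaves 0 and 1 yields three disjoint sets whose union is the original four regions.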
[21]
21. Encoder, characterized by being configured to encode a set of samples representing a spatially sampled information signal into payload data for each of a plurality of simply connected regions into which the set of information samples is subdivided, and coding parameters associated with the respective simply connected region so as to prescribe how the respective simply connected region is to be reconstructed from the payload data for the respective simply connected region, wherein the encoder is further configured to identify, for a predetermined simply connected region, simply connected regions within the plurality of simply connected regions which have a predetermined relative locational relationship to the predetermined simply connected region; if the number of simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region is greater than zero, insert a merge indicator for the predetermined simply connected region into the data stream; and, if the merge indicator suggests a merge processing for the predetermined simply connected region and the number of simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region is one, refrain from inserting the coding parameters of the predetermined simply connected region into the data stream, or predict the coding parameters for the predetermined simply connected region from the coding parameters of the simply connected region having the predetermined relative locational relationship to the predetermined simply connected region, with insertion of a prediction residual for the predetermined simply connected region into the data stream.
[22]
22. Encoder according to claim 21, characterized by being further configured such that, if the number of simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region is greater than 1, it checks whether the coding parameters associated with these simply connected regions are identical to one another and, if so, refrains from inserting the coding parameters for the predetermined simply connected region into the data stream, or predicts the coding parameters for the predetermined simply connected region from the coding parameters of the number of simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region, with insertion of a prediction residual into the data stream.
[23]
23. Encoder for generating a data stream into which a two-dimensional information signal is encoded, characterized by comprising:
a subdivider/merger stage configured to determine a first subset of syntax elements defining a spatial subdivision of a set of information samples representing a spatial sampling of the two-dimensional information signal into a plurality of simply connected regions of different sizes by recursive multi-partitioning, and a second subset of syntax elements, disjoint from the first subset, defining a combination of spatially neighboring simply connected regions of the plurality of simply connected regions to obtain an intermediate subdivision of the set of information samples into disjoint sets of simply connected regions, the union of which is the plurality of simply connected regions; and a data stream generator configured to encode the set of information samples into the data stream using the intermediate subdivision, with insertion of the first and second subsets of syntax elements into the data stream.
[24]
24. Encoder, characterized by being configured to encode a set of samples representing a spatially sampled information signal into payload data for each of a plurality of simply connected regions into which the set of information samples is subdivided, and coding parameters associated with the respective simply connected region so as to prescribe how the respective simply connected region is to be reconstructed from the payload data for the respective simply connected region, wherein the encoder is further configured to insert a first subset of the coding parameters for a predetermined simply connected region into the data stream; to identify, for the predetermined simply connected region, simply connected regions within the plurality of simply connected regions which have a predetermined relative locational relationship to the predetermined simply connected region; if the number of simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region is greater than zero, to insert a merge indicator for the predetermined simply connected region into the data stream; and, if the merge indicator suggests a merge processing for the predetermined simply connected region, to calculate, for each of the simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region, a distance, according to a predetermined distance measure, between the first subset of the coding parameters of the predetermined simply connected region and a corresponding subset of the coding parameters of the respective simply connected region having the predetermined relative locational relationship to the predetermined simply connected region, and to refrain from inserting a second subset of the coding parameters for the predetermined simply connected region, disjoint from the first subset, into the data stream, or to predict the second subset of the coding parameters for the predetermined simply connected region from a corresponding subset of the coding parameters of the simply connected region having the minimum distance, with insertion of a prediction residual for the predetermined simply connected region into the data stream.
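A toy rendering of the minimum-distance rule in claim 24: the encoder transmits a first parameter subset, measures its distance to the corresponding subset of each candidate, and predicts the untransmitted second subset from the closest candidate. The distance measure (here an L1 norm over motion-vector-like tuples) and the dictionary layout are illustrative assumptions, not the patent's definitions:

```python
def predict_second_subset(first_subset, candidates, distance):
    """Among the candidate regions, pick the one whose corresponding
    coding-parameter subset is closest (under `distance`) to the already
    transmitted first subset, and return its parameters as the prediction
    for the untransmitted second subset."""
    best = min(candidates, key=lambda c: distance(first_subset, c["first"]))
    return best["second"]

def l1(a, b):
    """Example distance measure: L1 norm between two parameter tuples."""
    return sum(abs(p - q) for p, q in zip(a, b))
```

The decoder can perform the same minimization, so only the prediction residual for the second subset needs to be signalled.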
[25]
25. Method of encoding a set of samples representing a spatially sampled information signal into payload data for each of a plurality of simply connected regions into which the set of information samples is subdivided, and coding parameters associated with the respective simply connected region so as to prescribe how the respective simply connected region is to be reconstructed from the payload data for the respective simply connected region, characterized in that the method comprises: identifying, for a predetermined simply connected region, simply connected regions within the plurality of simply connected regions which have a predetermined relative locational relationship to the predetermined simply connected region; if the number of simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region is greater than zero, inserting a merge indicator for the predetermined simply connected region into the data stream; and, if the merge indicator suggests a merge processing for the predetermined simply connected region and the number of simply connected regions having the predetermined relative locational relationship to the predetermined simply connected region is one, refraining from inserting the coding parameters of the predetermined simply connected region into the data stream, or predicting the coding parameters for the predetermined simply connected region from the coding parameters of the simply connected region having the predetermined relative locational relationship to the predetermined simply connected region, with insertion of a prediction residual for the predetermined simply connected region into the data stream.
[26]
26. Method for generating a data stream into which a two-dimensional information signal is encoded, characterized by comprising:
determining a first subset of syntax elements defining a spatial subdivision of a set of information samples representing a spatial sampling of the two-dimensional information signal into a plurality of simply connected regions of different sizes by recursive multi-partitioning, and a second subset of syntax elements, disjoint from the first subset, defining a combination of spatially neighboring simply connected regions of the plurality of simply connected regions to obtain an intermediate subdivision of the set of information samples into disjoint sets of simply connected regions, the union of which is the plurality of simply connected regions; and encoding the set of information samples into the data stream using the intermediate subdivision, with insertion of the first and second subsets of syntax elements into the data stream.
[27]
27. Digital computer-readable storage medium, characterized in that a computer program having a program code is stored thereon for performing, when running on a computer, a method according to any one of claims 19, 20, 25 and 26.
[28]
28. Encoded data stream, characterized by having encoded therein a set of samples representing a spatially sampled information signal, the data stream comprising payload data for each of a plurality of simply connected regions into which the set of information samples is subdivided, and coding parameters associated with the respective simply connected region so as to prescribe how the respective simply connected region is to be reconstructed from the payload data for the respective simply connected region, wherein the data stream further comprises a merge indicator for those predetermined simply connected regions for which the number of simply connected regions within the plurality of simply connected regions having a predetermined relative locational relationship to the respective predetermined simply connected region is greater than zero, and, if the merge indicator for a respective predetermined simply connected region suggests a merge processing and the number of simply connected regions having the predetermined relative locational relationship to the respective predetermined simply connected region is one, an absence of the coding parameters of the respective predetermined simply connected region within the data stream, or a prediction residual for the respective predetermined simply connected region for the predictive reconstruction of the coding parameters for the respective predetermined simply connected region from the coding parameters of the simply connected region having the predetermined relative locational relationship to the respective predetermined simply connected region.
[29]
29. Data stream according to claim 28, characterized in that it further comprises, if the number of simply connected regions having the predetermined relative locational relationship to the respective predetermined simply connected region is greater than 1 and the coding parameters associated with these simply connected regions are identical to one another, an absence of the coding parameters for the respective predetermined simply connected region, or a prediction residual for the respective predetermined simply connected region for the predictive reconstruction of the coding parameters for the respective predetermined simply connected region from the coding parameters of the number of simply connected regions having the predetermined relative locational relationship to the respective predetermined simply connected region.
[30]
30. Data stream into which a two-dimensional information signal is encoded, characterized by comprising:
a first subset of syntax elements defining a spatial subdivision of a set of information samples representing a spatial sampling of the two-dimensional information signal into a plurality of simply connected regions of different sizes by recursive multi-partitioning, and a second subset of syntax elements, disjoint from the first subset, defining a combination of spatially neighboring simply connected regions of the plurality of simply connected regions to obtain an intermediate subdivision of the set of information samples into disjoint sets of simply connected regions, the union of which is the plurality of simply connected regions, wherein the set of information samples is encoded into the data stream according to the intermediate subdivision.
Similar technologies:
Publication number | Publication date | Patent title
BR122020008236B1|2021-02-17|inheritance in a multitree subdivision arrangement sample
BR112012026393A2|2020-04-14|sample region fusion
BR112012026400B1|2021-08-10|INTER-PLANE PREDICTION
PT2559245E|2015-09-24|Video coding using multi-tree sub-divisions of images
BR112012026383B1|2021-12-07|DECODER, DECODING METHOD, ENCODER, ENCODING METHOD AND DIGITAL STORAGE MEDIA
TWI746414B|2021-11-11|Decoder, encoder, and methods and data stream associated therewith
Patent family:
Publication number | Publication date
KR20190115119A|2019-10-10|
JP6778496B2|2020-11-04|
CN106162173A|2016-11-23|
US20180218397A1|2018-08-02|
US11087355B2|2021-08-10|
CN106162169A|2016-11-23|
KR101754507B1|2017-07-06|
CN106303536A|2017-01-04|
US20200065855A1|2020-02-27|
TWI730420B|2021-06-11|
EP3490257A1|2019-05-29|
TWI605706B|2017-11-11|
TW201828701A|2018-08-01|
TWI713356B|2020-12-11|
US20190197579A1|2019-06-27|
CN106303522B9|2020-01-31|
EP3703369A1|2020-09-02|
KR20200119355A|2020-10-19|
KR101626688B1|2016-06-01|
EP3089450A1|2016-11-02|
US20130039423A1|2013-02-14|
KR101549644B1|2015-09-03|
TWI466548B|2014-12-21|
CN106162172A|2016-11-23|
KR102166519B1|2020-10-16|
JP5911028B2|2016-04-27|
CN102939754A|2013-02-20|
TW202025758A|2020-07-01|
HUE030688T2|2017-05-29|
PL2559246T3|2017-02-28|
TW201517598A|2015-05-01|
CN106162189A|2016-11-23|
TW202114425A|2021-04-01|
CN106162189B|2020-03-24|
US20190164188A1|2019-05-30|
KR20200119353A|2020-10-19|
JP6909818B2|2021-07-28|
JP2020092449A|2020-06-11|
CN106303536B|2021-01-08|
JP6845282B2|2021-03-17|
KR102360005B1|2022-02-08|
JP5718453B2|2015-05-13|
US10460344B2|2019-10-29|
KR102360146B1|2022-02-08|
KR20140074402A|2014-06-17|
TWI678916B|2019-12-01|
TWI675586B|2019-10-21|
CN106101700A|2016-11-09|
KR102166520B1|2020-10-16|
KR20180108896A|2018-10-04|
JP2014195261A|2014-10-09|
CN106162170A|2016-11-23|
CN106162169B|2020-02-28|
CN106101700B|2019-08-13|
US20210304248A1|2021-09-30|
KR20150022020A|2015-03-03|
US20200074503A1|2020-03-05|
EP2559246B1|2016-06-08|
JP2016158259A|2016-09-01|
US20190087857A1|2019-03-21|
ES2590686T3|2016-11-23|
KR20130020890A|2013-03-04|
JP2021101548A|2021-07-08|
WO2011128366A1|2011-10-20|
KR102337744B1|2021-12-09|
US10803485B2|2020-10-13|
CN106210736B|2020-06-16|
US10719850B2|2020-07-21|
EP2559246A1|2013-02-20|
US10672028B2|2020-06-02|
JP2019083583A|2019-05-30|
TW202031043A|2020-08-16|
TW201828703A|2018-08-01|
TW201828702A|2018-08-01|
KR20200019794A|2020-02-24|
CN106210736A|2016-12-07|
US20200320570A1|2020-10-08|
TW201204047A|2012-01-16|
CN106162171B|2020-09-11|
KR20180105277A|2018-09-27|
KR20170078881A|2017-07-07|
US11037194B2|2021-06-15|
US10803483B2|2020-10-13|
KR20200098721A|2020-08-20|
JP2013524709A|2013-06-17|
US10621614B2|2020-04-14|
KR101903919B1|2018-10-02|
JP2019198081A|2019-11-14|
CN106162170B|2019-07-02|
CN106162172B|2020-06-02|
CN106303523A|2017-01-04|
TWI718940B|2021-02-11|
CN102939754B|2016-09-07|
TW202029762A|2020-08-01|
KR102030951B1|2019-10-10|
CN106162171A|2016-11-23|
TW202139705A|2021-10-16|
US20200410532A1|2020-12-31|
US20200019986A1|2020-01-16|
PT2559246T|2016-09-14|
US10748183B2|2020-08-18|
CN106303523B|2020-09-11|
US10248966B2|2019-04-02|
DK2559246T3|2016-09-19|
CN106162173B|2020-02-28|
KR102145722B1|2020-08-20|
US20200372537A1|2020-11-26|
CN106303522B|2019-08-16|
CN106303522A|2017-01-04|
TWI675587B|2019-10-21|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title

FR2633468B1|1988-06-24|1990-11-09|France Etat|METHOD OF ENCODING ASSISTANCE DATA FOR THE RECONSTRUCTION OF SUB-SAMPLE ANIMATED ELECTRONIC IMAGES|
US5784631A|1992-06-30|1998-07-21|Discovision Associates|Huffman decoder|
US5809270A|1992-06-30|1998-09-15|Discovision Associates|Inverse quantizer|
US7095783B1|1992-06-30|2006-08-22|Discovision Associates|Multistandard video decoder and decompression system for processing encoded bit streams including start codes and methods relating thereto|
US6408097B1|1993-08-30|2002-06-18|Sony Corporation|Picture coding apparatus and method thereof|
US5446806A|1993-11-15|1995-08-29|National Semiconductor Corporation|Quadtree-structured Walsh transform video/image coding|
CA2145361C|1994-03-24|1999-09-07|Martin William Sotheran|Buffer manager|
WO1997015146A1|1995-10-18|1997-04-24|Philips Electronics N.V.|Method of encoding video images|
US6084908A|1995-10-25|2000-07-04|Sarnoff Corporation|Apparatus and method for quadtree based variable block size motion estimation|
TW346571B|1996-02-06|1998-12-01|Matsushita Electric Ind Co Ltd|Data reception apparatus, data transmission apparatus, information processing system, data reception method|
US6005981A|1996-04-11|1999-12-21|National Semiconductor Corporation|Quadtree-structured coding of color images and intra-coded images|
DE19615493A1|1996-04-19|1997-10-23|Philips Patentverwaltung|Image segmentation method|
US6639945B2|1997-03-14|2003-10-28|Microsoft Corporation|Method and apparatus for implementing motion detection in video compression|
US6057884A|1997-06-05|2000-05-02|General Instrument Corporation|Temporal and spatial scaleable coding for video object planes|
US6269192B1|1997-07-11|2001-07-31|Sarnoff Corporation|Apparatus and method for multiscale zerotree entropy encoding|
CN1882093B|1998-03-10|2011-01-12|索尼公司|Transcoding system using encoding history information|
US6067574A|1998-05-18|2000-05-23|Lucent Technologies Inc|High speed routing using compressed tree process|
US6269175B1|1998-08-28|2001-07-31|Sarnoff Corporation|Method and apparatus for enhancing regions of aligned images using flow estimation|
US6563953B2|1998-11-30|2003-05-13|Microsoft Corporation|Predictive image compression using a single variable length code for both the luminance and chrominance blocks for each macroblock|
US7085319B2|1999-04-17|2006-08-01|Pts Corporation|Segment-based encoding system using segment hierarchies|
JP2000350207A|1999-06-08|2000-12-15|Matsushita Electric Ind Co Ltd|Generalized orthogonal transform method and device for low resolution video decoding|
FI116992B|1999-07-05|2006-04-28|Nokia Corp|Methods, systems, and devices for enhancing audio coding and transmission|
WO2001031497A1|1999-10-22|2001-05-03|Activesky, Inc.|An object oriented video system|
JP3957937B2|1999-12-21|2007-08-15|キヤノン株式会社|Image processing apparatus and method, and storage medium|
US6456905B2|1999-12-22|2002-09-24|Honeywell International, Inc.|Method and apparatus for limiting attitude drift during turns|
AU3027301A|2000-01-21|2001-07-31|Nokia Mobile Phones Ltd|A motion estimation method and a system for a video coder|
FI116819B|2000-01-21|2006-02-28|Nokia Corp|Procedure for transferring images and an image encoder|
US6502920B1|2000-02-04|2003-01-07|Lexmark International, Inc|Ink jet print head having offset nozzle arrays|
US6910001B2|2000-03-22|2005-06-21|Schlumberger Technology Corp.|Distributed multiresolution geometry modeling system and method|
US6785423B1|2000-05-26|2004-08-31|Eastman Kodak Company|Producing a compressed digital image organized into layers having information relating to different viewing conditions and resolutions|
JP2004503964A|2000-06-14|2004-02-05|コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ|Color video encoding and decoding method|
JP2004505520A|2000-07-25|2004-02-19|コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ|Video coding method using wavelet decomposition|
AUPR063400A0|2000-10-06|2000-11-02|Canon Kabushiki Kaisha|Xml encoding scheme|
US7929610B2|2001-03-26|2011-04-19|Sharp Kabushiki Kaisha|Methods and systems for reducing blocking artifacts with reduced complexity for spatially-scalable video coding|
JP2003018602A|2001-04-24|2003-01-17|Monolith Co Ltd|Method and device for encoding and decoding image data|
US6987866B2|2001-06-05|2006-01-17|Micron Technology, Inc.|Multi-modal motion estimation for video sequences|
US7483581B2|2001-07-02|2009-01-27|Qualcomm Incorporated|Apparatus and method for encoding digital image data in a lossless manner|
US7643559B2|2001-09-14|2010-01-05|Ntt Docomo, Inc.|Coding method, decoding method, coding apparatus, decoding apparatus, image processing system, coding program, and decoding program|
US7450641B2|2001-09-14|2008-11-11|Sharp Laboratories Of America, Inc.|Adaptive filtering based upon boundary strength|
US6950469B2|2001-09-17|2005-09-27|Nokia Corporation|Method for sub-pixel value interpolation|
EP1442600B1|2001-10-16|2010-04-28|Koninklijke Philips Electronics N.V.|Video coding method and corresponding transmittable video signal|
WO2003043345A1|2001-11-16|2003-05-22|Ntt Docomo, Inc.|Image encoding method, image decoding method, image encoder, image decode, program, computer data signal, and image transmission system|
US7295609B2|2001-11-30|2007-11-13|Sony Corporation|Method and apparatus for coding image information, method and apparatus for decoding image information, method and apparatus for coding and decoding image information, and system of coding and transmitting image information|
EP1324615A1|2001-12-28|2003-07-02|Deutsche Thomson-Brandt Gmbh|Transcoding MPEG bitstreams for adding sub-picture content|
KR20030065606A|2002-01-30|2003-08-09|양송철|Multi level structured system of bonus accumulation and circulation using individual automatic independent code and its operation|
CN101127899B|2002-04-12|2015-04-01|三菱电机株式会社|Hint information description method|
JP4130780B2|2002-04-15|2008-08-06|松下電器産業株式会社|Image encoding method and image decoding method|
US20030198290A1|2002-04-19|2003-10-23|Dynamic Digital Depth Pty.Ltd.|Image encoding system|
US7433526B2|2002-04-30|2008-10-07|Hewlett-Packard Development Company, L.P.|Method for compressing images and image sequences through adaptive partitioning|
US7154952B2|2002-07-19|2006-12-26|Microsoft Corporation|Timestamp-independent motion vector prediction for predictive and bidirectionally predictive pictures|
AU2003242037A1|2002-07-02|2004-01-23|Matsushita Electric Industrial Co., Ltd.|Image encoding method and image decoding method|
US6975773B1|2002-07-30|2005-12-13|Qualcomm, Incorporated|Parameter selection in data compression and decompression|
US7266247B2|2002-09-30|2007-09-04|Samsung Electronics Co., Ltd.|Image coding method and apparatus using spatial predictive coding of chrominance and image decoding method and apparatus|
JP3950777B2|2002-09-30|2007-08-01|キヤノン株式会社|Image processing method, image processing apparatus, and image processing program|
JP2004135252A|2002-10-09|2004-04-30|Sony Corp|Encoding processing method, encoding apparatus, and decoding apparatus|
US7254533B1|2002-10-17|2007-08-07|Dilithium Networks Pty Ltd.|Method and apparatus for a thin CELP voice codec|
EP1431919B1|2002-12-05|2010-03-03|Samsung Electronics Co., Ltd.|Method and apparatus for encoding and decoding three-dimensional object data by using octrees|
US20070036215A1|2003-03-03|2007-02-15|Feng Pan|Fast mode decision algorithm for intra prediction for advanced video coding|
US7366352B2|2003-03-20|2008-04-29|International Business Machines Corporation|Method and apparatus for performing fast closest match in pattern recognition|
US7643558B2|2003-03-24|2010-01-05|Qualcomm Incorporated|Method, apparatus, and system for encoding and decoding side information for multimedia transmission|
AT336763T|2003-03-28|2006-09-15|Digital Accelerator Corp|TRANSFORMATION BASED REMAINING FRAME MOVEMENT OVERCOMPLETE BASIC CODING PROCESS AND ASSOCIATED VIDEO COMPRESSION DEVICE|
HU0301368A3|2003-05-20|2005-09-28|Amt Advanced Multimedia Techno|Method and equipment for compressing motion picture data|
CN101616329B|2003-07-16|2013-01-02|三星电子株式会社|Video encoding/decoding apparatus and method for color image|
EP1509045A3|2003-07-16|2006-08-09|Samsung Electronics Co., Ltd.|Lossless image encoding/decoding method and apparatus using intercolor plane prediction|
US7010044B2|2003-07-18|2006-03-07|Lsi Logic Corporation|Intra 4×4 modes 3, 7 and 8 availability determination intra estimation and compensation|
FR2858741A1|2003-08-07|2005-02-11|Thomson Licensing Sa|DEVICE AND METHOD FOR COMPRESSING DIGITAL IMAGES|
CN1322472C|2003-09-08|2007-06-20|中国人民解放军第一军医大学|Quad tree image compressing and decompressing method based on wavelet conversion prediction|
JP4677901B2|2003-10-29|2011-04-27|日本電気株式会社|Decoding apparatus or encoding apparatus in which intermediate buffer is inserted between arithmetic code decoder or encoder and inverse binarization converter or binarization converter|
KR20050045746A|2003-11-12|2005-05-17|삼성전자주식회사|Method and device for motion estimation using tree-structured variable block size|
US7418455B2|2003-11-26|2008-08-26|International Business Machines Corporation|System and method for indexing weighted-sequences in large databases|
KR100556911B1|2003-12-05|2006-03-03|엘지전자 주식회사|Video data format for wireless video streaming service|
US7599435B2|2004-01-30|2009-10-06|Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V.|Video frame encoding and decoding|
US7649539B2|2004-03-10|2010-01-19|Microsoft Corporation|Image formats for video capture, processing and display|
CN1691087B|2004-04-26|2011-07-06|图形安全系统公司|System and method for decoding digital coding image|
AT555477T|2004-04-28|2012-05-15|Panasonic Corp|MOVIE MOVING STREAM GENERATOR, MOVABLE IMAGE CODING DEVICE, MOBILE PICTURE MULTIPLEX DEVICE, AND MOBILE IMAGE DECODER|
CN1281065C|2004-05-20|2006-10-18|复旦大学|Tree-structure-based grade tree aggregation-divided video image compression method|
US20070230574A1|2004-05-25|2007-10-04|Koninklijke Philips Electronics N.C.|Method and Device for Encoding Digital Video Data|
US20060002474A1|2004-06-26|2006-01-05|Oscar Chi-Lim Au|Efficient multi-block motion estimation for video compression|
CN1812579B|2004-06-27|2010-04-21|苹果公司|Efficient use of storage in encoding and decoding video data stream|
US7292257B2|2004-06-28|2007-11-06|Microsoft Corporation|Interactive viewpoint video system and process|
CN1268136C|2004-07-02|2006-08-02|上海广电(集团)有限公司中央研究院|Frame field adaptive coding method based on image slice structure|
KR100657268B1|2004-07-15|2006-12-14|학교법인 대양학원|Scalable encoding and decoding method of color video, and apparatus thereof|
CN101124589A|2004-08-09|2008-02-13|图形安全系统公司|System and method for authenticating objects using multiple-level encoding and decoding|
CN1589022A|2004-08-26|2005-03-02|中芯联合(北京)微电子有限公司|Macroblock split mode selecting method in multiple mode movement estimation decided by oriented tree|
DE102004059993B4|2004-10-15|2006-08-31|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Apparatus and method for generating a coded video sequence using interlayer motion data prediction, and computer program and computer readable medium|
CN101416149A|2004-10-21|2009-04-22|索尼电子有限公司|Supporting fidelity range extensions in advanced video codec file format|
CN1780278A|2004-11-19|2006-05-31|松下电器产业株式会社|Self adaptable modification and encode method and apparatus in sub-carrier communication system|
US20060120454A1|2004-11-29|2006-06-08|Park Seung W|Method and apparatus for encoding/decoding video signal using motion vectors of pictures in base layer|
KR100703734B1|2004-12-03|2007-04-05|삼성전자주식회사|Method and apparatus for encoding/decoding multi-layer video using DCT upsampling|
WO2006058921A1|2004-12-03|2006-06-08|Thomson Licensing|Method for scalable video coding|
KR101138392B1|2004-12-30|2012-04-26|삼성전자주식회사|Color image encoding and decoding method and apparatus using a correlation between chrominance components|
US7970219B2|2004-12-30|2011-06-28|Samsung Electronics Co., Ltd.|Color image encoding and decoding method and apparatus using a correlation between chrominance components|
US20060153300A1|2005-01-12|2006-07-13|Nokia Corporation|Method and system for motion vector prediction in scalable video coding|
US20060153295A1|2005-01-12|2006-07-13|Nokia Corporation|Method and system for inter-layer prediction mode coding in scalable video coding|
CN101213840B|2005-02-18|2011-02-02|汤姆森许可贸易公司|Method for deriving coding information for high resolution pictures from low resolution pictures and coding and decoding devices implementing said method|
CN101204092B|2005-02-18|2010-11-03|汤姆森许可贸易公司|Method for deriving coding information for high resolution images from low resolution images and coding and decoding devices implementing said method|
JP4504230B2|2005-03-02|2010-07-14|株式会社東芝|Moving image processing apparatus, moving image processing method, and moving image processing program|
TWI259727B|2005-03-09|2006-08-01|Sunplus Technology Co Ltd|Method for rapidly determining macroblock mode|
US7961963B2|2005-03-18|2011-06-14|Sharp Laboratories Of America, Inc.|Methods and systems for extended spatial scalability with picture-level adaptation|
EP1711018A1|2005-04-08|2006-10-11|Thomson Licensing|Method and apparatus for encoding video pictures, and method and apparatus for decoding video pictures|
US20060233262A1|2005-04-13|2006-10-19|Nokia Corporation|Signaling of bit stream ordering in scalable video coding|
KR101246915B1|2005-04-18|2013-03-25|삼성전자주식회사|Method and apparatus for encoding or decoding moving picture|
KR100746007B1|2005-04-19|2007-08-06|삼성전자주식회사|Method and apparatus for adaptively selecting context model of entrophy coding|
KR100763181B1|2005-04-19|2007-10-05|삼성전자주식회사|Method and apparatus for improving coding rate by coding prediction information from base layer and enhancement layer|
EP1880364A1|2005-05-12|2008-01-23|Bracco Imaging S.P.A.|Method for coding pixels or voxels of a digital image and a method for processing digital images|
EP1908292A4|2005-06-29|2011-04-27|Nokia Corp|Method and apparatus for update step in video coding using motion compensated temporal filtering|
JP4444180B2|2005-07-20|2010-03-31|株式会社東芝|Texture encoding apparatus, texture decoding apparatus, method, and program|
RU2406253C2|2005-07-21|2010-12-10|Томсон Лайсенсинг|Method and device for weighted prediction for scalable video signal coding|
CA2732532C|2005-07-22|2013-08-20|Mitsubishi Electric Corporation|Image decoder that decodes a color image signal and related method|
US9113147B2|2005-09-27|2015-08-18|Qualcomm Incorporated|Scalability techniques based on content information|
WO2007036759A1|2005-09-29|2007-04-05|Telecom Italia S.P.A.|Method for scalable video coding|
WO2007047271A2|2005-10-12|2007-04-26|Thomson Licensing|Methods and apparatus for weighted prediction in scalable video encoding and decoding|
KR100763196B1|2005-10-19|2007-10-04|삼성전자주식회사|Method for coding flags in a layer using inter-layer correlation, method for decoding the coded flags, and apparatus thereof|
EP1946563A2|2005-10-19|2008-07-23|Thomson Licensing|Multi-view video coding using scalable video coding|
JP2007135252A|2005-11-08|2007-05-31|Hitachi Ltd|Power converter|
KR100873636B1|2005-11-14|2008-12-12|삼성전자주식회사|Method and apparatus for encoding/decoding image using single coding mode|
RU2340114C1|2005-11-18|2008-11-27|Сони Корпорейшн|Coding device and method, decoding device and method and transmission system|
KR100717055B1|2005-11-18|2007-05-10|삼성전자주식회사|Method of decoding bin values using pipeline architecture, and decoding apparatus therefor|
GB0600141D0|2006-01-05|2006-02-15|British Broadcasting Corp|Scalable coding of video signals|
WO2007077116A1|2006-01-05|2007-07-12|Thomson Licensing|Inter-layer motion prediction method|
KR20070074451A|2006-01-09|2007-07-12|엘지전자 주식회사|Method for using video signals of a baselayer for interlayer prediction|
EP1977607A4|2006-01-09|2014-12-17|Lg Electronics Inc|Inter-layer prediction method for video signal|
US8315308B2|2006-01-11|2012-11-20|Qualcomm Incorporated|Video coding with fine granularity spatial scalability|
US8861585B2|2006-01-20|2014-10-14|Qualcomm Incorporated|Method and apparatus for error resilience algorithms in wireless video communication|
US7929608B2|2006-03-28|2011-04-19|Sony Corporation|Method of reducing computations in intra-prediction and mode decision processes in a digital video encoder|
CN101416399B|2006-03-31|2013-06-19|英特尔公司|Layered decoder and method for implementing layered decode|
CN101047733B|2006-06-16|2010-09-29|华为技术有限公司|Short message processing method and device|
KR101526914B1|2006-08-02|2015-06-08|Thomson Licensing|Methods and apparatus for adaptive geometric partitioning for video decoding|
US20080086545A1|2006-08-16|2008-04-10|Motorola, Inc.|Network configuration using configuration parameter inheritance|
CN101507280B|2006-08-25|2012-12-26|汤姆逊许可公司|Methods and apparatus for reduced resolution partitioning|
CN102158697B|2006-09-07|2013-10-09|LG Electronics Inc.|Method and apparatus for decoding/encoding of a video signal|
CN100471275C|2006-09-08|2009-03-18|Tsinghua University|Motion estimating method for H.264/AVC coder|
CN100486336C|2006-09-21|2009-05-06|Shanghai University|Real time method for segmenting motion object based on H.264 compression domain|
US9014280B2|2006-10-13|2015-04-21|Qualcomm Incorporated|Video coding with adaptive filtering for motion compensated prediction|
WO2008049052A2|2006-10-18|2008-04-24|Apple Inc.|Scalable video coding with filtering of lower layers|
US7775002B2|2006-11-10|2010-08-17|John Puchniak|Portable hurricane and security window barrier|
CN101395921B|2006-11-17|2012-08-22|LG Electronics Inc.|Method and apparatus for decoding/encoding a video signal|
EP1933564A1|2006-12-14|2008-06-18|Thomson Licensing|Method and apparatus for encoding and/or decoding video data using adaptive prediction order for spatial and bit depth prediction|
BRPI0721077A2|2006-12-28|2014-07-01|Nippon Telegraph & Telephone|VIDEO ENCODING METHOD AND VIDEO DECODING METHOD, APPARATUSES THEREFOR, PROGRAMS THEREFOR, AND STORAGE MEDIA STORING THE PROGRAMS|
WO2008084423A1|2007-01-08|2008-07-17|Nokia Corporation|Improved inter-layer prediction for extended spatial scalability in video coding|
CN101018333A|2007-02-09|2007-08-15|Shanghai University|Coding method of fine and classified video of space domain classified noise/signal ratio|
WO2008154041A1|2007-06-14|2008-12-18|Thomson Licensing|Modifying a coded bitstream|
JP2010135863A|2007-03-28|2010-06-17|Toshiba Corp|Method and device for encoding image|
BRPI0809512A2|2007-04-12|2016-03-15|Thomson Licensing|context-dependent merge method and apparatus for direct jump modes for video encoding and decoding|
KR20080093386A|2007-04-16|2008-10-21|Electronics and Telecommunications Research Institute|Color video scalability encoding and decoding method and device thereof|
TW200845723A|2007-04-23|2008-11-16|Thomson Licensing|Method and apparatus for encoding video data, method and apparatus for decoding encoded video data and encoded video signal|
CN100515087C|2007-05-30|2009-07-15|VIA Technologies, Inc.|Method and device for determining whether or not two adjacent macro zone block locate on same banded zone|
KR100906243B1|2007-06-04|2009-07-07|Korea Electronics Technology Institute|Video coding method of rgb color space signal|
CN100496129C|2007-06-05|2009-06-03|Nanjing University|H.264 based multichannel video transcoding multiplexing method|
JP2008311781A|2007-06-12|2008-12-25|Ntt Docomo Inc|Motion picture encoder, motion picture decoder, motion picture encoding method, motion picture decoding method, motion picture encoding program and motion picture decoding program|
BRPI0810517A2|2007-06-12|2014-10-21|Thomson Licensing|METHODS AND APPARATUS SUPPORTING MULTIPASS VIDEO SYNTAX STRUCTURE FOR SECTION DATA|
JP4551948B2|2007-06-13|2010-09-29|Sharp Corporation|Linear light source device, surface light emitting device, planar light source device, and liquid crystal display device|
US8428133B2|2007-06-15|2013-04-23|Qualcomm Incorporated|Adaptive coding of video block prediction mode|
US8085852B2|2007-06-26|2011-12-27|Mitsubishi Electric Research Laboratories, Inc.|Inverse tone mapping for bit-depth scalable image coding|
US8422803B2|2007-06-28|2013-04-16|Mitsubishi Electric Corporation|Image encoding device, image decoding device, image encoding method and image decoding method|
CN100534186C|2007-07-05|2009-08-26|Xidian University|JPEG2000 self-adapted rate control system and method based on pre-allocated code rate|
US8458612B2|2007-07-29|2013-06-04|Hewlett-Packard Development Company, L.P.|Application management framework for web applications|
CN101119493B|2007-08-30|2010-12-01|VIA Technologies, Inc.|Coding method and device for block type digital coding image|
KR20090030681A|2007-09-20|2009-03-25|Samsung Electronics Co., Ltd.|Image processing apparatus, display apparatus, display system and control method thereof|
US8374446B2|2007-09-28|2013-02-12|Vsevolod Yurievich Mokrushin|Encoding and decoding of digital signals based on compression of hierarchical pyramid|
KR101403343B1|2007-10-04|2014-06-09|Samsung Electronics Co., Ltd.|Method and apparatus for inter prediction encoding/decoding using sub-pixel motion estimation|
BRPI0818344A2|2007-10-12|2015-04-22|Thomson Licensing|Methods and apparatus for encoding and decoding video of geometrically partitioned bi-predictive mode partitions|
US7777654B2|2007-10-16|2010-08-17|Industrial Technology Research Institute|System and method for context-based adaptive binary arithematic encoding and decoding|
BRPI0818649A2|2007-10-16|2015-04-07|Thomson Licensing|Methods and apparatus for encoding and decoding video in geometrically partitioned superblocks.|
CN101415149B|2007-10-19|2010-12-08|Huawei Technologies Co., Ltd.|Method and apparatus for improving BC business|
GB2454195A|2007-10-30|2009-05-06|Sony Corp|Address generation polynomial and permutation matrix for DVB-T2 16k OFDM sub-carrier mode interleaver|
CN101676744B|2007-10-31|2012-07-11|Beihang University|Method for tracking small target with high precision under complex background and low signal-to-noise ratio|
US8270472B2|2007-11-09|2012-09-18|Thomson Licensing|Methods and apparatus for adaptive reference filtering of bi-predictive pictures in multi-view coded video|
US8540158B2|2007-12-12|2013-09-24|Yiwu Lei|Document verification using dynamic document identification framework|
US20090154567A1|2007-12-13|2009-06-18|Shaw-Min Lei|In-loop fidelity enhancement for video compression|
US20090165041A1|2007-12-21|2009-06-25|Penberthy John S|System and Method for Providing Interactive Content with Video Content|
WO2009083926A2|2007-12-28|2009-07-09|Nxp B.V.|Arrangement and approach for image data processing|
US8126054B2|2008-01-09|2012-02-28|Motorola Mobility, Inc.|Method and apparatus for highly scalable intraframe video coding|
EP2232875A2|2008-01-11|2010-09-29|Thomson Licensing|Video and depth coding|
US8155184B2|2008-01-16|2012-04-10|Sony Corporation|Video coding system using texture analysis and synthesis in a scalable coding framework|
AT524927T|2008-01-21|2011-09-15|Ericsson Telefon Ab L M|PRESENTATION BASED IMAGE PROCESSING|
EP2245596B1|2008-01-21|2017-07-12|Telefonaktiebolaget LM Ericsson |Prediction-based image processing|
KR101291196B1|2008-01-25|2013-07-31|Samsung Electronics Co., Ltd.|Video encoding method and apparatus, and video decoding method and apparatus|
US8711948B2|2008-03-21|2014-04-29|Microsoft Corporation|Motion-compensated prediction of inter-layer residuals|
US8179974B2|2008-05-02|2012-05-15|Microsoft Corporation|Multi-level representation of reordered transform coefficients|
US20100220469A1|2008-05-23|2010-09-02|Altair Engineering, Inc.|D-shaped cross section l.e.d. based light|
TWI373959B|2008-06-09|2012-10-01|Kun Shan University Of Technology|Wavelet codec with a function of adjustable image quality|
KR101517768B1|2008-07-02|2015-05-06|Samsung Electronics Co., Ltd.|Method and apparatus for encoding video and method and apparatus for decoding video|
US8406307B2|2008-08-22|2013-03-26|Microsoft Corporation|Entropy coding/decoding of hierarchically organized data|
US8750379B2|2008-09-11|2014-06-10|General Instrument Corporation|Method and apparatus for complexity-scalable motion estimation|
JP5422168B2|2008-09-29|2014-02-19|Hitachi, Ltd.|Video encoding method and video decoding method|
US8634456B2|2008-10-03|2014-01-21|Qualcomm Incorporated|Video coding with large macroblocks|
US8503527B2|2008-10-03|2013-08-06|Qualcomm Incorporated|Video coding with large macroblocks|
US8619856B2|2008-10-03|2013-12-31|Qualcomm Incorporated|Video coding with large macroblocks|
US20100086031A1|2008-10-03|2010-04-08|Qualcomm Incorporated|Video coding with large macroblocks|
CN101404774B|2008-11-13|2010-06-23|Sichuan Hongwei Technology Co., Ltd.|Macro-block partition mode selection method in movement search|
JP5001964B2|2009-02-18|2012-08-15|NTT Docomo, Inc.|Image coding apparatus, method and program, and image decoding apparatus, method and program|
CN101493890B|2009-02-26|2011-05-11|Shanghai Jiao Tong University|Dynamic vision caution region extracting method based on characteristic|
US8810562B2|2009-05-19|2014-08-19|Advanced Micro Devices, Inc.|Hierarchical lossless compression|
US8395708B2|2009-07-21|2013-03-12|Qualcomm Incorporated|Method and system for detection and enhancement of video images|
KR101456498B1|2009-08-14|2014-10-31|Samsung Electronics Co., Ltd.|Method and apparatus for video encoding considering scanning order of coding units with hierarchical structure, and method and apparatus for video decoding considering scanning order of coding units with hierarchical structure|
EP2485490B1|2009-10-01|2015-09-30|SK Telecom Co., Ltd.|Method and apparatus for encoding/decoding image using split layer|
KR101457418B1|2009-10-23|2014-11-04|Samsung Electronics Co., Ltd.|Method and apparatus for video encoding and decoding dependent on hierarchical structure of coding unit|
US8594200B2|2009-11-11|2013-11-26|Mediatek Inc.|Method of storing motion vector information and video decoding apparatus|
JP5475409B2|2009-11-20|2014-04-16|Mitsubishi Electric Corporation|Moving picture coding apparatus and moving picture coding method|
WO2011063397A1|2009-11-23|2011-05-26|General Instrument Corporation|Depth coding as an additional channel to video sequence|
US8315310B2|2010-01-08|2012-11-20|Research In Motion Limited|Method and device for motion vector prediction in video transcoding using full resolution residuals|
US20110170608A1|2010-01-08|2011-07-14|Xun Shi|Method and device for video transcoding using quad-tree based mode selection|
KR101750046B1|2010-04-05|2017-06-22|Samsung Electronics Co., Ltd.|Method and apparatus for video encoding with in-loop filtering based on tree-structured data unit, method and apparatus for video decoding with the same|
KR101529992B1|2010-04-05|2015-06-18|Samsung Electronics Co., Ltd.|Method and apparatus for video encoding for compensating pixel value of pixel group, method and apparatus for video decoding for the same|
KR101847072B1|2010-04-05|2018-04-09|Samsung Electronics Co., Ltd.|Method and apparatus for video encoding, and method and apparatus for video decoding|
US20110249743A1|2010-04-09|2011-10-13|Jie Zhao|Super-block for high performance video coding|
TWI678916B|2010-04-13|2019-12-01|GE Video Compression, LLC|Sample region merging|
CN106060558B|2010-04-13|2019-08-13|GE Video Compression, LLC|Decoder, the method for rebuilding array, encoder, coding method|
BR122020007923B1|2010-04-13|2021-08-03|Ge Video Compression, Llc|INTERPLANE PREDICTION|
DK2559245T3|2010-04-13|2015-08-24|Ge Video Compression Llc|Video coding using multi-tree subdivision of images|
KR20110135471A|2010-06-11|2011-12-19|Humax Co., Ltd.|Apparatuses and methods for encoding/decoding of video using block merging|
KR102277273B1|2010-10-08|2021-07-15|GE Video Compression, LLC|Picture coding supporting block partitioning and block merging|
KR101527666B1|2010-11-04|2015-06-09|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Picture coding supporting block merging and skip mode|
US20120170648A1|2011-01-05|2012-07-05|Qualcomm Incorporated|Frame splitting in video coding|
PT3471415T|2011-06-16|2021-11-04|Ge Video Compression Llc|Entropy coding of motion vector differences|
CN106660506B|2014-07-22|2019-08-16|Autoliv Development AB|Side air bag device|
BR122020007923B1|2010-04-13|2021-08-03|Ge Video Compression, Llc|INTERPLANE PREDICTION|
TWI678916B|2010-04-13|2019-12-01|GE Video Compression, LLC|Sample region merging|
CN106060558B|2010-04-13|2019-08-13|GE Video Compression, LLC|Decoder, the method for rebuilding array, encoder, coding method|
DK2559245T3|2010-04-13|2015-08-24|Ge Video Compression Llc|Video coding using multi-tree subdivision of images|
KR101791242B1|2010-04-16|2017-10-30|SK Telecom Co., Ltd.|Video Coding and Decoding Method and Apparatus|
KR101791078B1|2010-04-16|2017-10-30|SK Telecom Co., Ltd.|Video Coding and Decoding Method and Apparatus|
JP2011259093A|2010-06-07|2011-12-22|Sony Corp|Image decoding apparatus and image encoding apparatus and method and program therefor|
DK2858366T3|2010-07-09|2017-02-13|Samsung Electronics Co Ltd|Method of decoding video using block merge|
KR101484281B1|2010-07-09|2015-01-21|Samsung Electronics Co., Ltd.|Method and apparatus for video encoding using block merging, method and apparatus for video decoding using block merging|
US20130215968A1|2010-10-28|2013-08-22|University-Industry Cooperation Group Of Kyung Hee University|Video information encoding method and decoding method|
EP2719176B1|2011-06-13|2021-07-07|Dolby Laboratories Licensing Corporation|Visual display resolution prediction based on fused regions|
KR101753551B1|2011-06-20|2017-07-03|JVC Kenwood Corporation|Image encoding device, image encoding method and recording medium storing image encoding program|
CN102291583A|2011-09-29|2011-12-21|AVIC Huadong Photoelectric Co., Ltd.|Image coding device and image coding method thereof|
US9807401B2|2011-11-01|2017-10-31|Qualcomm Incorporated|Transform unit partitioning for chroma components in video coding|
EP2942961A1|2011-11-23|2015-11-11|HUMAX Holdings Co., Ltd.|Methods for encoding/decoding of video using common merging candidate set of asymmetric partitions|
KR101909544B1|2012-01-19|2018-10-18|삼성전자주식회사|Apparatus and method for plane detection|
US9866829B2|2012-01-22|2018-01-09|Qualcomm Incorporated|Coding of syntax elements that correspond to coefficients of a coefficient block in video coding|
EP2629156A1|2012-01-27|2013-08-21|Samsung Electronics Co., Ltd|Image processing apparatus and method|
CN103389879B|2012-05-10|2016-08-17|Silicon Motion, Inc.|Electronic installation and the method being transferred data to display device by electronic installation|
US9332266B2|2012-08-24|2016-05-03|Industrial Technology Research Institute|Method for prediction in image encoding and image encoding apparatus applying the same|
GB2505408A|2012-08-24|2014-03-05|British Broadcasting Corp|Video Encoding and Decoding with Chrominance Sub-sampling|
TW201419863A|2012-11-13|2014-05-16|Hon Hai Prec Ind Co Ltd|System and method for splitting an image|
TW201419865A|2012-11-13|2014-05-16|Hon Hai Prec Ind Co Ltd|System and method for splitting an image|
TW201419862A|2012-11-13|2014-05-16|Hon Hai Prec Ind Co Ltd|System and method for splitting an image|
JP5719401B2|2013-04-02|2015-05-20|Nippon Telegraph and Telephone Corporation|Block size determination method, video encoding device, and program|
KR101749855B1|2013-04-05|2017-06-21|Mitsubishi Electric Corporation|Color image encoding apparatus, color image decoding apparatus, color image encoding method, and color image decoding method|
US9686561B2|2013-06-17|2017-06-20|Qualcomm Incorporated|Inter-component filtering|
US9716899B2|2013-06-27|2017-07-25|Qualcomm Incorporated|Depth oriented inter-view motion vector prediction|
US20150063455A1|2013-09-02|2015-03-05|Humax Holdings Co., Ltd.|Methods and apparatuses for predicting depth quadtree in three-dimensional video|
US20150089374A1|2013-09-20|2015-03-26|Cyan Inc.|Network visualization system and method|
US10075266B2|2013-10-09|2018-09-11|Qualcomm Incorporated|Data transmission scheme with unequal code block sizes|
US20150103883A1|2013-10-11|2015-04-16|Mediatek Inc.|Method and apparatus for fast intra prediction|
EP3846469A1|2013-10-18|2021-07-07|GE Video Compression, LLC|Multi-component picture or video coding concept|
US20150189269A1|2013-12-30|2015-07-02|Google Inc.|Recursive block partitioning|
CA2946779C|2014-05-05|2019-10-01|Mediatek Singapore Pte. Ltd.|Method and apparatus for determining residue transform tree representation|
CN112087630A|2014-09-30|2020-12-15|Huawei Technologies Co., Ltd.|Image prediction method and related device|
CN105828080B|2015-01-26|2020-02-14|Tongji University|Image coding and decoding method and device|
CN109005407A|2015-05-15|2018-12-14|Huawei Technologies Co., Ltd.|Method for encoding and decoding video pictures, encoding device and decoding device|
CN106358042B|2015-07-17|2020-10-09|NXP USA, Inc.|Parallel decoder using inter-prediction of video images|
US10560713B2|2015-09-24|2020-02-11|Lg Electronics Inc.|Method and apparatus for motion vector refinement-based inter prediction in image coding system|
CN105357530B|2015-10-16|2018-06-19|Guangzhou Baiguoyuan Network Technology Co., Ltd.|Method and device for predictive coding|
US10972731B2|2015-11-10|2021-04-06|Interdigital Madison Patent Holdings, Sas|Systems and methods for coding in super-block based video coding framework|
US10469841B2|2016-01-29|2019-11-05|Google Llc|Motion vector prediction using prior frame residual|
US10306258B2|2016-01-29|2019-05-28|Google Llc|Last frame motion vector partitioning|
US10609423B2|2016-09-07|2020-03-31|Qualcomm Incorporated|Tree-type coding for video coding|
US10110914B1|2016-09-15|2018-10-23|Google Llc|Locally adaptive warped motion compensation in video coding|
WO2018131986A1|2017-01-16|2018-07-19|Industry-Academia Cooperation Foundation of Sejong University|Image encoding/decoding method and device|
EP3358754A1|2017-02-02|2018-08-08|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Antenna array codebook with beamforming coefficients adapted to an arbitrary antenna response of the antenna array|
US10430104B2|2017-05-08|2019-10-01|International Business Machines Corporation|Distributing data by successive spatial partitionings|
WO2019136699A1|2018-01-12|2019-07-18|Guangdong Oppo Mobile Telecommunications Corp., Ltd.|Information transmission method and device|
KR20200124755A|2018-04-13|2020-11-03|LG Electronics Inc.|Inter prediction method and apparatus in video processing system|
US11128871B2|2018-04-25|2021-09-21|Panasonic Intellectual Property Corporation Of America|Encoder for adaptively determining information related to splitting based on characteristics of neighboring samples|
US10593097B2|2018-05-08|2020-03-17|Qualcomm Technologies, Inc.|Distributed graphics processing|
CN109035267B|2018-06-22|2021-07-27|华东师范大学|Image target matting method based on deep learning|
JP2020065143A|2018-10-16|2020-04-23|Seiko Epson Corporation|Image processing apparatus, method for controlling image processing apparatus, and display unit|
CN113875255A|2019-03-21|2021-12-31|SK Telecom Co., Ltd.|Method for recovering in units of sub-blocks and image decoding apparatus|
EP3967035A1|2019-05-10|2022-03-16|Beijing Dajia Internet Information Technology Co., Ltd.|Methods and apparatuses for video coding with triangle prediction|
WO2021136821A1|2019-12-30|2021-07-08|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Encoding and decoding of color components in pictures|
WO2021222038A1|2020-04-27|2021-11-04|Bytedance Inc.|Sublayers information in video coding|
Legal status:
2020-05-26| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2020-05-26| B15K| Others concerning applications: alteration of classification|Free format text: THE PREVIOUS CLASSIFICATIONS WERE: H04N 7/26, H04N 7/50 Ipc: H04N 19/105 (2014.01), H04N 19/119 (2014.01), H04N |
2020-06-02| B25A| Requested transfer of rights approved|Owner name: GE VIDEO COMPRESSION, LLC (US) |
2021-10-19| B350| Update of information on the portal [chapter 15.35 patent gazette]|
2021-12-21| B07A| Application suspended after technical examination (opinion) [chapter 7.1 patent gazette]|
Priority:
Application number | Filing date | Patent title
PCT/EP2010/054833|WO2011127963A1|2010-04-13|2010-04-13|Sample region merging|
EP10159799|2010-04-13|
PCT/EP2011/055795|WO2011128366A1|2010-04-13|2011-04-13|Sample region merging|