Abstract:
Computer program product, method, and apparatus for transforming instruction specifiers of a computing environment. Emulation of instructions that include non-contiguous specifiers is facilitated. A non-contiguous specifier specifies a resource of an instruction, such as a register, using multiple fields of the instruction. For example, multiple fields of the instruction (e.g., two fields) include bits that together designate a particular register to be used by the instruction. Non-contiguous specifiers of instructions defined in one computer system architecture are transformed into contiguous specifiers usable by instructions defined in another computer system architecture. The instructions defined in the other computer system architecture emulate the instructions defined for the first computer system architecture.
Publication number: BR112014022638B1
Application number: R112014022638-5
Filing date: 2012-11-15
Publication date: 2022-01-04
Inventor: Michael Karl Gschwind
Applicant: International Business Machines Corporation;
Main IPC class:
Description:

01. The invention relates, in general, to emulation within a computing environment and, in particular, to the emulation of specifiers within instructions.
02. Emulation imitates functions on a computer architecture, referred to as the target architecture. The target architecture differs from the computer architecture, referred to as the source architecture, for which the functions were defined. For example, an instruction written for the z/Architecture provided by International Business Machines Corporation, Armonk, New York, may be translated and represented as one or more instructions of a different architecture, such as PowerPC, also provided by International Business Machines Corporation, or another architecture offered by International Business Machines Corporation or another company. These translated instructions perform the same or a similar function as the instruction being translated.
03. There are different types of emulation, including interpretation and translation. With interpretation, the data representing an instruction is read and each instruction is executed as it is decoded. Each instruction is executed each time it is referenced. However, with translation, also called binary translation or recompilation, sequences of instructions are translated from the instruction set of one computer architecture to the instruction set of another computer architecture.
04. There are multiple types of translation, including static translation and dynamic translation. In static translation, code of an instruction of one architecture is converted to code that runs on the other architecture without executing the code first. In contrast, in dynamic translation, at least one section of code is executed and translated, and the result is cached for subsequent execution by a processor of the target computer architecture.
SUMMARY OF THE INVENTION
05. Shortcomings of the prior art are overcome and advantages are provided through a computer program product for transforming instruction specifiers of a computing environment. The computer program product includes a computer readable storage medium readable by a processing circuit and storing instructions for execution by the processing circuit to perform a method including: obtaining, by a processor, from a first instruction defined for a first computer architecture, a non-contiguous specifier, the non-contiguous specifier having a first part and a second part, where the obtaining includes obtaining the first part from a first field of the instruction and the second part from a second field of the instruction, the first field being separate from the second field; generating a contiguous specifier using the first part and the second part, the generating using one or more rules based on the opcode of the first instruction; and using the contiguous specifier to indicate a resource to be used in executing a second instruction, the second instruction being defined for a second computer architecture different from the first computer architecture and emulating a function of the first instruction.
06. Methods and systems relating to one or more aspects of the present invention are also described and claimed herein. In addition, services relating to one or more aspects of the present invention are also described and may be claimed herein.
07. Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered part of the claimed invention.
BRIEF DESCRIPTION OF THE DRAWINGS
08. Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings in which:
09. FIG. 1 depicts an example of a computing environment for incorporating and using one or more aspects of the present invention;
010. FIG. 2 depicts additional details of the memory of FIG. 1, according to one aspect of the present invention;
011. FIG. 3 depicts an embodiment of an overview of the emulation process that employs one or more interpretations and translations;
012. FIG. 4 depicts an example of logic associated with the interpretation block referenced in FIG. 3;
013. FIG. 5 depicts an example of logic associated with the translation block referenced in FIG. 3;
014. FIG. 6 depicts another embodiment of an overview of an emulation process employing one or more modified interpretations and translations in accordance with an aspect of the present invention;
015. FIG. 7A depicts an example of logic associated with the interpretation block referenced in FIG. 6, according to one aspect of the present invention;
016. FIG. 7B depicts an embodiment of logic for transforming a non-contiguous specifier into a contiguous specifier, in accordance with an aspect of the present invention;
017. FIG. 8 depicts an example of logic associated with the translation block referenced in FIG. 6, according to one aspect of the present invention;
018. FIG. 9A depicts an embodiment of transforming a non-contiguous specifier in a Load Vector instruction of one computer architecture to a contiguous specifier in a Load Vector Indexed instruction of another computer architecture, in accordance with an aspect of the present invention;
019. FIG. 9B depicts another example of the transformation of FIG. 9A, including allocating a particular register to the contiguous specifier, in accordance with an aspect of the present invention;
020. FIG. 10 depicts an example of a register file, in accordance with an aspect of the present invention;
021. FIG. 11 depicts an example of transforming non-contiguous specifiers into contiguous specifiers in memory allocation during emulation, in accordance with an aspect of the present invention;
022. FIG. 12 depicts an embodiment of a computer program product incorporating one or more aspects of the present invention;
023. FIG. 13 depicts an embodiment of a host computer system for incorporating and using one or more aspects of the present invention;
024. FIG. 14 depicts a further example of a computer system for incorporating and using one or more aspects of the present invention;
025. FIG. 15 depicts another example of a computer system comprising a network of computers for incorporating and using one or more aspects of the present invention;
026. FIG. 16 depicts an embodiment of various elements of a computing system for incorporating and using one or more aspects of the present invention;
027. FIG. 17A depicts an embodiment of the computer system execution unit of FIG. 16 to incorporate and use one or more aspects of the present invention;
028. FIG. 17B depicts an embodiment of the branch unit of the computer system of FIG. 16 to incorporate and use one or more aspects of the present invention;
029. FIG. 17C depicts an embodiment of the load/store unit of the computer system of FIG. 16 to incorporate and use one or more aspects of the present invention; and
030. FIG. 18 depicts an embodiment of an emulated host computer system to incorporate and use one or more aspects of the present invention.
DETAILED DESCRIPTION
031. In accordance with one aspect of the present invention, a technique is provided for facilitating the emulation of instructions that include non-contiguous specifiers. A non-contiguous specifier specifies a resource of an instruction, such as a register, using multiple fields of the instruction. For example, multiple fields of the instruction (e.g., two fields) include bits that together designate a particular register to be used by the instruction.
032. In a particular aspect of the invention, a technique is provided for transforming non-contiguous specifiers of instructions defined in one computer system architecture (e.g., the z/Architecture provided by International Business Machines Corporation) into contiguous specifiers usable by instructions defined in another computer system architecture (e.g., the PowerPC architecture offered by International Business Machines Corporation). The instructions defined in the other computer system architecture emulate the instructions defined for the first computer system architecture.
033. An embodiment of a computing environment providing emulation is described with reference to FIG. 1. In one embodiment, the computing environment 100 includes, for example, a native central processing unit 102, memory 104, and one or more input/output devices and/or interfaces 106 coupled together, for example, via one or more buses 108 and/or other connections. By way of example, the computing environment 100 may include a PowerPC processor, a pSeries server or an xSeries server, from International Business Machines Corporation, Armonk, New York; an HP Superdome with Intel Itanium II processors from Hewlett Packard Co., Palo Alto, California; and/or other machines based on architectures offered by International Business Machines Corporation, Hewlett Packard, Intel, Oracle, or others.
034. The native central processing unit 102 includes one or more native registers 110, such as one or more general purpose registers and/or one or more special purpose registers used during processing within the environment. These registers include information that represents the state of the environment at a given point in time.
035. In addition, the native central processing unit 102 executes instructions and code that are stored in the memory 104. In one specific example, the central processing unit executes the emulator code 112 stored in the memory 104. This code enables the processing environment configured in one architecture to emulate another architecture. For example, emulator code 112 allows machines based on architectures other than the z/Architecture, such as PowerPC processors, pSeries servers, xSeries servers, HP Superdome servers, and others, to emulate the z/Architecture and to run software and instructions developed for the z/Architecture.
036. Additional details regarding emulator code 112 are described with reference to FIG. 2. Guest instructions 200 comprise software instructions (e.g., machine instructions) that were designed to run on an architecture other than that of the native CPU 102. For example, guest instructions 200 may have been designed to run on a z/Architecture processor, but instead they are being emulated on the native CPU 102, which may be, for example, a PowerPC processor or another type of processor. In one example, emulator code 112 includes an instruction fetch unit 202 to obtain one or more guest instructions 200 from memory 104 and optionally provide local buffering for the fetched instructions. It also includes an instruction translation routine 204 for determining the type of guest instruction that has been obtained and for translating the guest instruction into one or more native instructions 206. This translation includes, for example, identifying the function to be performed by the guest instruction (e.g., via the opcode (operation code)) and choosing the native instruction(s) to perform that function.
037. In addition, the emulator 112 includes an emulation control routine 210 to cause the native instructions to be executed. The emulation control routine 210 may cause the native CPU 102 to execute a routine of native instructions that emulate one or more previously obtained guest instructions and, at the end of that execution, return control to the instruction fetch routine to emulate the obtaining of the next guest instruction or group of guest instructions. Execution of native instructions 206 may include loading data into a register from memory 104; storing data back to memory from a register; or performing some type of arithmetic or logical operation, as determined by the translation routine.
038. Each routine is, for example, implemented in software, which is stored in memory and executed by the native central processing unit 102. In other embodiments, one or more of the routines or operations are implemented in firmware, hardware, software, or some combination thereof. The registers of the emulated processor may be emulated using the registers 110 of the native CPU or by using locations in memory 104. In embodiments, guest instructions 200, native instructions 206, and emulator code 112 may reside in the same memory or may be distributed among different memory devices.
039. As used in this document, firmware includes, e.g., the microcode, millicode and/or macrocode of the processor. It includes, for example, the hardware-level instructions and/or data structures used in implementing higher-level machine code. In one embodiment, it includes, for example, proprietary code that is typically delivered as microcode that includes trusted software or microcode specific to the underlying hardware and controls the operating system's access to the system hardware.
040. In one example, a guest instruction 200 that is obtained, translated, and executed is one or more of the instructions described in this document. The instruction, which is of one architecture (e.g., the z/Architecture), is fetched from memory, translated, and represented as a sequence of native instructions 206 of another architecture (e.g., PowerPC, pSeries, xSeries, Intel, etc.). These native instructions are then executed.
041. Additional details regarding emulation are described with reference to FIGs. 3-5. FIG. 3, in particular, depicts an embodiment of an overview of an emulation process that employs one or more interpretations and translations; FIG. 4 depicts an embodiment of the logic associated with the interpretation referenced in FIG. 3 (Technique 2000); and FIG. 5 depicts an embodiment of the logic associated with the binary translation referenced in FIG. 3 (Technique 3000). In this particular example, instructions written for the z/Architecture are being translated to PowerPC instructions. However, the same techniques are applicable for emulation from the z/Architecture to other target architectures; from other source architectures to the PowerPC architecture; and/or from other source architectures to other target architectures.
042. Referring to FIG. 3, during emulation, an instruction, designated instruction X, is obtained and interpreted, as described in more detail with reference to FIG. 4, STEP 300. Various statistics related to the interpreted instruction are updated, STEP 302, and processing then proceeds to the next instruction, STEP 304. A determination is made as to whether this next instruction has a previously translated entry point, QUERY 306. If not, a further determination is made as to whether this next instruction has been seen N (e.g., 15) times, QUERY 308. That is, has this instruction been seen often enough to justify optimizing its execution by, for example, performing just-in-time (JIT) compilation of the code, which provides an entry point for subsequent use? If this instruction has not been seen N times, such as 15 times, then processing continues with STEP 300. Otherwise, processing continues with forming a group of instructions and translating the group from one architecture to another architecture, STEP 310. An example of such translation is described with reference to FIG. 5. After the group is formed and translated, the group is executed, STEP 312, and processing continues to STEP 304.
043. Returning to QUERY 306, if there is an existing translated entry point for the instruction, processing continues with executing the group at the entry point, STEP 312.
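The control flow of FIG. 3 can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the patented implementation: the function name `emulate`, the callbacks `interpret` and `translate_group`, and the dictionary-based program model are all hypothetical and chosen only to make the flow runnable.

```python
from collections import defaultdict

HOT_THRESHOLD = 15  # "N" in the text; 15 is the example value given there


def emulate(program, interpret, translate_group, start=0):
    """Interpret cold instructions; translate and cache hot ones.

    program         -- hypothetical model: dict of address -> guest instruction
    interpret       -- hypothetical callback: pc -> next pc (STEP 300)
    translate_group -- hypothetical callback: pc -> callable that runs the
                       translated group and returns the next pc (STEP 310)
    """
    seen = defaultdict(int)   # per-instruction statistics (STEP 302)
    translated = {}           # entry point -> translated group
    pc = start
    while pc in program:
        if pc in translated:                      # QUERY 306
            pc = translated[pc]()                 # STEP 312
        elif seen[pc] >= HOT_THRESHOLD:           # QUERY 308
            translated[pc] = translate_group(pc)  # STEP 310
            pc = translated[pc]()                 # STEP 312
        else:
            next_pc = interpret(pc)               # STEP 300
            seen[pc] += 1                         # STEP 302
            pc = next_pc                          # STEP 304
    return translated, seen
```

In this sketch an instruction is interpreted until it has been seen 15 times; on the next visit it is translated once, and the cached entry point is used thereafter.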
044. More details regarding the interpretation of an instruction (Technique 2000) are described with reference to FIG. 4. Initially, the instruction at the next program counter (PC) address is read, STEP 400. This instruction is parsed, and the opcode, register, and immediate fields are extracted, STEP 402. A branch is then performed to the code that emulates the behavior corresponding to the extracted opcode, STEP 404. The emulation code is then executed, STEP 406.
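The four steps of FIG. 4 can be sketched as follows. The 16-bit toy instruction format (4-bit opcode, 4-bit register, 8-bit immediate), the two handlers, and the `state` dictionary are assumptions invented for this illustration; they do not correspond to any real architecture named in this document.

```python
def op_load_imm(state, reg, imm):
    """Hypothetical guest operation: register <- immediate."""
    state["regs"][reg] = imm


def op_add_imm(state, reg, imm):
    """Hypothetical guest operation: register <- register + immediate."""
    state["regs"][reg] = (state["regs"][reg] + imm) & 0xFFFFFFFF


# Opcode -> emulation routine (the branch target of STEP 404).
HANDLERS = {0x1: op_load_imm, 0x2: op_add_imm}


def interpret_one(state, memory):
    insn = memory[state["pc"]]       # STEP 400: read instruction at the PC
    opcode = (insn >> 12) & 0xF      # STEP 402: extract opcode,
    reg = (insn >> 8) & 0xF          #           register field,
    imm = insn & 0xFF                #           and immediate field
    handler = HANDLERS[opcode]       # STEP 404: select the emulation code
    handler(state, reg, imm)         # STEP 406: execute it
    state["pc"] += 1
```

A dictionary dispatch is used here for brevity; an interpreter could equally use a jump table or a switch statement.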
045. More details regarding translating instructions within a group (Technique 3000) are described with reference to FIG. 5. Initially, an instruction in a predefined group of instructions is read, STEP 500. The group may be formed in a variety of ways. According to one embodiment, a group is formed to encompass a single path of execution along a most likely path. In another embodiment, a group is formed to encompass one of the last previous execution paths, or the current execution path, based on the state of the emulated architecture. In another embodiment, all branches are assumed to be not taken. In yet another embodiment, multiple paths are included in a group, such as all paths starting from the group's starting point. In another embodiment, all instructions up to and including the first branch are added to a group (i.e., a group corresponds to a straight-line piece of code, commonly called a "basic block"). In each embodiment, a decision has to be made as to when and where to end a group. In one embodiment, the group is ended after a fixed number of instructions. In another embodiment, the group is ended after the cumulative probability of reaching an instruction falls below a given threshold. In some embodiments, a group is ended immediately when a stopping condition is reached. In another set of embodiments, a group is ended only at a well-defined "stopping point", e.g., a defined instruction, a specific group start alignment, or another condition.
046. Thereafter, the instruction is parsed and the opcode, register, and immediate fields are extracted from the instruction, STEP 502. Next, an internal representation of the extracted information is provided, STEP 504. This internal representation is a format of the extracted information that is used by the processor (e.g., compiler or translator) to optimize decoding, register allocation, and/or other tasks associated with translating the instruction.
047. In addition, a determination is made as to whether there is another instruction in the group to be translated, QUERY 506. If so, processing continues with STEP 500. Otherwise, processing continues with optimizing the internal representation, STEP 508, allocating one or more registers for the group of instructions, STEP 510, and generating code that emulates the instructions in the group, STEP 512.
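As an illustration of two of the grouping policies just listed, the sketch below forms a group that ends at the first branch (a basic block) or after a fixed number of instructions, whichever comes first. The function name `form_group`, the `is_branch` predicate, and the default limit of 32 are hypothetical choices made for this example.

```python
def form_group(instructions, start, is_branch, max_len=32):
    """Collect instructions from `start` up to and including the first
    branch, but never more than `max_len` instructions.

    instructions -- sequence of decoded guest instructions (any objects)
    is_branch    -- hypothetical predicate: instruction -> bool
    """
    group = []
    for insn in instructions[start:start + max_len]:
        group.append(insn)
        if is_branch(insn):
            break  # a basic block ends with its first branch
    return group
```

The other policies described above (most likely path, probability threshold, multiple paths) would need profiling data and are not covered by this sketch.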
048. While the above interpretation and translation procedures provide for the emulation of an instruction defined in one architecture by one or more instructions defined in another architecture, advances can be made in emulating instructions that use non-contiguous specifiers. For example, in accordance with one aspect of the present invention, improvements are provided in emulation techniques to address the situation in which a register operand of an instruction is designated by multiple fields of the instruction.
049. One type of instruction that uses non-contiguous specifiers is the vector instructions that are part of a vector facility, provided in accordance with an aspect of the present invention. In many vector instructions, the register field does not include all the bits needed to designate a register to be used by the instruction; instead, another field is used along with the register field to designate the register. This other field is referred to herein as the RXB field.
050. The RXB field, also called the register extension bit field, is, for example, a four-bit field (bits 0-3) that includes the most significant bit for each of the vector register designated operands of a vector instruction. Bits for register designations not specified by the instruction are reserved and set to zero.
051. In one example, the RXB bits are defined as follows:
052. 0 - Most significant bit for the instruction's first vector register designation.
053. 1 - Most significant bit for the instruction's second vector register designation, if any.
054. 2 - Most significant bit for the instruction's third vector register designation, if any.
055. 3 - Most significant bit for the instruction's fourth vector register designation, if any.
056. Each bit is set to zero or one by the assembler, for example, depending on the register number. For example, for registers 0-15, the bit is set to zero, for registers 16-31, the bit is set to 1, etc.
057. In one embodiment, each RXB bit is an extension bit for a given location in an instruction that includes one or more vector registers. For example, in one or more vector instructions, bit 0 of the RXB is an extension bit for location 8-11, which is assigned to V1; bit 1 of the RXB is an extension bit for location 12-15, which is assigned to, e.g., V2; and so on.
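The assembler-side split described above can be illustrated with a short sketch. The function name `split_register` and its zero-based `operand_index` argument are hypothetical; the bit layout follows the text, with RXB bit 0 being the leftmost bit of the four-bit field.

```python
def split_register(regnum, operand_index):
    """Split a vector register number (0-31) into the 4-bit register
    field value and the RXB contribution for the given operand.

    operand_index -- 0 for the first operand (RXB bit 0, leftmost),
                     up to 3 for the fourth operand (RXB bit 3)
    """
    if not 0 <= regnum <= 31:
        raise ValueError("vector register number must be 0-31")
    if not 0 <= operand_index <= 3:
        raise ValueError("operand index must be 0-3")
    field = regnum & 0xF                    # low four bits go in the register field
    extension = regnum >> 4                 # 0 for registers 0-15, 1 for 16-31
    rxb = extension << (3 - operand_index)  # place it at the operand's RXB bit
    return field, rxb
```

For an instruction with several vector operands, the RXB field would be the bitwise OR of the contributions of all operands.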
058. In a further embodiment, the RXB field includes additional bits, and more than one bit is used as an extension for each vector register or location.
059. In accordance with one aspect of the present invention, techniques are provided for transforming non-contiguous operand specifiers into contiguous specifiers. Once transformed, the contiguous specifiers are used without any regard to the non-contiguous specifiers.
060. One embodiment of logic for emulating instructions using non-contiguous specifiers is described with reference to FIGs. 6-8. FIG. 6 in particular depicts an overview of an emulation process including one or more interpretations and translations of instructions that include non-contiguous specifiers; FIG. 7A depicts an embodiment of interpretation (Technique 6000), including interpretation of non-contiguous specifiers; FIG. 7B depicts an embodiment of transforming a noncontiguous specifier to a contiguous specifier; and FIG. 8 depicts a translation embodiment (Technique 7000) including translation of non-contiguous specifiers.
061. Referring initially to FIG. 6, an overview of an emulation process is provided. This overview is similar to the overview provided in FIG. 3, except that STEP 600 uses Technique 6000 described with reference to FIG. 7A instead of Technique 2000 referenced in STEP 300; and STEP 610 uses Technique 7000 described with reference to FIG. 8 instead of Technique 3000 referenced in STEP 310. Since the overview has been described above with reference to FIG. 3, it is not repeated here; instead, the discussion proceeds to the logic of FIG. 7A.
062. Referring to FIG. 7A, STEPS 700, 702, 704, and 706 are similar, respectively, to STEPS 400, 402, 404, and 406 of FIG. 4 and therefore will not be described again; however, STEPS 703 and 705 are described. With STEP 703, in accordance with one aspect of the present invention, a contiguous specifier (also called a contiguous index herein) is generated from a noncontiguous specifier. More details regarding the generation of a contiguous specifier from a non-contiguous specifier are described with reference to FIG. 7B.
063. Referring to FIG. 7B, in one embodiment, the non-contiguous specifier is initially obtained, STEP 750. This includes, for example, determining from the opcode that the instruction has a non-contiguous specifier and determining which fields of the instruction are used to designate the non-contiguous specifier. For example, part of the opcode specifies an instruction format, and this format tells the processor that the instruction has at least one non-contiguous specifier and also specifies the fields used to designate the non-contiguous specifier. These fields are then read to obtain the data (e.g., bits) in them. For example, in many vector instructions, location 8-11 of the instruction (e.g., V1) specifies a plurality of bits (e.g., 4) used to designate a vector register, and an RXB field of the instruction includes one or more additional bits used to designate the specific vector register. These bits are obtained in this step.
064. After the non-contiguous specifier is obtained (e.g., the bits from register field V1 and the bit(s) from RXB), one or more rules are used to combine the parts of the non-contiguous specifier to create the contiguous specifier, STEP 752. The rule(s) depend on, for example, the instruction format as specified by the instruction opcode. In a particular example in which the opcode indicates an RXB field, the rule(s) include using the RXB bit(s) associated with the register operand as the most significant bit(s) for the bits specified in the register field. For example, the RXB field has, in one embodiment, 4 bits, and each bit corresponds to a register operand: bit 0 corresponds to the first register operand, bit 1 corresponds to the second register operand, and so on. Thus, the bit corresponding to the register operand is extracted and used to form the contiguous specifier. For example, if binary 0010 is specified in the first operand register field and binary 1000 is specified in the RXB field, the value of the bit associated with the first operand, bit 0 in this example, is concatenated to 0010. Therefore, the contiguous specifier is 10010 (register 18) in this example.
065. The generated contiguous specifier is then used as if it were the specifier provided in the instruction, STEP 754.
066. Thereafter, returning to FIG. 7A, a branch is performed to the code that emulates the behavior corresponding to the opcode, STEP 704. In addition, the contiguous index is used to manage the homogenized architecture resource without regard to the non-contiguous specifier, STEP 705. That is, the contiguous register specifier is used as if there were no non-contiguous specifier. Each contiguous specifier indicates a register to be used by the emulation code. Thereafter, the emulation code is executed, STEP 706.
067. More details regarding translation, including the transformation of non-contiguous specifiers into contiguous specifiers (designated Technique 7000), are described with reference to FIG. 8. In one embodiment, STEPS 800, 802, 804, 806, 808, 810, and 812 are similar to STEPS 500, 502, 504, 506, 508, 510, and 512, respectively, of FIG. 5 and are therefore not described again with reference to FIG. 8. However, in accordance with one aspect of the present invention, further steps are performed in order to transform a non-contiguous specifier of an instruction of the source architecture into a contiguous specifier of an instruction of the target architecture. The target architecture instruction emulates a function of the source architecture instruction.
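The combining rule of STEP 752 for an RXB-style instruction can be sketched in a few lines. The function name `contiguous_specifier` and the zero-based `operand_index` are hypothetical; the arithmetic mirrors the worked example in the text (register field 0010, RXB 1000, first operand).

```python
def contiguous_specifier(reg_field, rxb, operand_index):
    """Combine a 4-bit register field with its RXB extension bit.

    reg_field     -- 4-bit value read from the operand's register field
    rxb           -- 4-bit RXB field; bit 0 is the leftmost bit
    operand_index -- 0 for the first operand, 1 for the second, and so on
    """
    extension = (rxb >> (3 - operand_index)) & 1  # pick this operand's RXB bit
    return (extension << 4) | (reg_field & 0xF)   # use it as the top bit
```

With `reg_field = 0b0010`, `rxb = 0b1000`, and `operand_index = 0`, the result is binary 10010, which is register 18.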
068. For example, in STEP 803, a contiguous specifier is generated from a non-contiguous specifier. As described above with reference to FIG. 7B, this includes obtaining the non-contiguous specifier from the instruction to be emulated, and using one or more rules to create the contiguous specifier from the non-contiguous specifier. In one embodiment, the opcode of the instruction having the non-contiguous specifier indicates, at least implicitly by its format, that the instruction includes a non-contiguous specifier. For example, the instruction format is indicated by one or more bits of the opcode (e.g., the first two bits), and based on the format, the processor (e.g., compiler, translator, emulator of the processor) understands that this instruction includes a non-contiguous specifier, in which part of the specifier of a resource, such as a register, is included in one field of the instruction and one or more other parts of the specifier are located in one or more other fields of the instruction.
069. The opcode, as an example, also provides an indication to the processor of the one or more rules used to generate the contiguous specifier from the non-contiguous specifier. For example, the opcode may indicate that a particular instruction is a vector register instruction, and thus has an RXB field. The processor therefore accesses information (e.g., rules stored in memory or external storage) indicating that, for an instruction with an RXB field, the RXB field provides the most significant bit for its corresponding register field. The rules specify, for example, that to generate the contiguous specifier, the bits of the register field are combined with the one or more bits of the RXB field associated with the particular register operand.
070. After the contiguous specifier is generated, the contiguous specifier is used without regard to the non-contiguous specifier. For example, in STEP 808, the code is optimized using the contiguous specifier without regard to the non-contiguous specifier. Similarly, one or more registers are allocated using the contiguous specifier and without regard to the non-contiguous specifier, STEP 810. Furthermore, in STEP 812, the emulation code is generated without regard to the non-contiguous specifier and using the allocation performed in STEP 810. That is, in these steps there is no indication that the contiguous specifier was generated from a non-contiguous specifier. The non-contiguous specifier is ignored.
071. Additional details on translating a noncontiguous specifier to a contiguous specifier are described with reference to the examples in FIGS. 9A, 9B and 11. Referring initially to FIG. 9A, a Load Vector (VL) instruction 900 is depicted. In one example, the Load Vector instruction includes opcode fields 902a (eg, bits 0-7), 902b (eg, bits 40-47) indicating an operation of Load Vector; a vector register field 904 (e.g., bits 8-11) used to designate a vector register (V1); an index field (X2) 906 (e.g., bits 12-15); a base field (B2) 908 (e.g., bits 16-19); a shift field (D2) 910 (e.g., bits 20-31); and an RXB field 912 (e.g., bits 36-39). Both fields 904-912 in an example are separate and independent of the opcode field(s). Furthermore, in one embodiment they are separate and independent of each other; however, in other embodiments, more than one field may be combined. More information about the use of these fields is described below.
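The field layout listed above for the 48-bit Load Vector instruction can be turned into a small decoding sketch. Bit 0 is the leftmost (most significant) bit, as in the text. The helper names `bits` and `decode_load_vector` are hypothetical, and bits 32-35, which the description above does not cover, are simply not decoded.

```python
INSN_BITS = 48  # three halfwords


def bits(insn, first, last):
    """Extract bits first..last (inclusive, bit 0 = leftmost) of a
    48-bit instruction held in an integer."""
    width = last - first + 1
    return (insn >> (INSN_BITS - 1 - last)) & ((1 << width) - 1)


def decode_load_vector(insn):
    """Split a Load Vector instruction into the fields of FIG. 9A."""
    return {
        "opcode": (bits(insn, 0, 7), bits(insn, 40, 47)),  # fields 902a, 902b
        "v1": bits(insn, 8, 11),     # vector register field 904
        "x2": bits(insn, 12, 15),    # index field 906
        "b2": bits(insn, 16, 19),    # base field 908
        "d2": bits(insn, 20, 31),    # displacement field 910
        "rxb": bits(insn, 36, 39),   # RXB field 912
    }
```

The `v1` and `rxb` values returned here are the two parts of the non-contiguous specifier that the transformation combines.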
072. In one example, selected bits (e.g., the first two bits of the opcode designated by operational field 902a) specify an instruction size and format. In this particular example, the size is 3 half records (half-words) and the format is vector record storage operation and index with an extended field of opcode field. The vector field (V1), together with its corresponding extension bit specified by RXB, designates a vector register (i.e., a non-contiguous specifier). For vector registers in particular, the register that contains the operand is specified using, for example, a four-bit field of the register field with the addition of its register-extension bit (RXB) as the most significant bit. For example, if the four-bit field in V1 is binary 0010 and the extension bit for this operand is binary 1, then the five-bit field is binary 10010, indicating register number 18 (in decimal).
073. The subscript number associated with the instruction field denotes the operand to which the field applies. For example, the subscript number 1 associated with V1 denotes the first operand, and so on. This is used to determine which bit of the RXB field is matched to the record field. The register operand is a register in length, for example, 128 bytes. In one example, in a vector and index register store operation instruction, the contents of the general registers designated by fields X2 and B2 are added to the contents of field D2 to form the second operand address. In one example, the offset, D2, for the Load Vector instruction is treated as a 12-bit uninscribed integer.
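The formation of the second operand address can be sketched as below. Two points are assumptions not stated in the text: a field value of 0 in X2 or B2 is taken to contribute zero rather than the contents of general register 0 (consistent with the `0(0, gr5)` operand notation used for this instruction), and the sum is wrapped to 64 bits. The function name `second_operand_address` is hypothetical.

```python
MASK64 = (1 << 64) - 1


def second_operand_address(gprs, x2, b2, d2):
    """Add index register, base register, and 12-bit unsigned displacement.

    gprs -- list of 16 general register values
    Assumption: a register field of 0 means "no register" (contributes 0).
    Assumption: addresses wrap at 64 bits.
    """
    index = gprs[x2] if x2 != 0 else 0
    base = gprs[b2] if b2 != 0 else 0
    displacement = d2 & 0xFFF  # treated as a 12-bit unsigned integer
    return (index + base + displacement) & MASK64
```

For the operand `0(0, gr5)`, this yields the contents of general register 5 unchanged.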
074. In this example, since V1 is the first operand, the leftmost location (e.g., bit 0) of the RXB is associated with this operand. Therefore, the leftmost value is combined with the value in the V1 record field to generate the contiguous specifier, as described here.
075. In accordance with one aspect of the present invention, Load Vector instruction 900, which is defined, for example, in the z/Architecture, is emulated as a Load Vector Indexed instruction 950, defined, for example, in the PowerPC architecture. Although in this case the z/Architecture is the source architecture and PowerPC is the target architecture, this is just an example. Many other architectures can be used as one or both of the source and target architectures.
076. Each architecture has specific registers associated with it that it can use. In the z/Architecture, for example, there are 32 vector registers, and other types of registers can map to a quadrant of the vector registers. As an example, as shown in FIG. 10, if there is a register file 1000 that includes 32 vector registers 1002 and each register is 128 bits long, then 16 floating point registers 1004, each 64 bits long, can overlay the vector registers. So, as an example, when floating point register 2 is modified, vector register 2 is also modified. Other mappings for other types of registers are also possible.
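The overlay of floating point registers on vector registers can be modeled with a single byte array, as in the sketch below. The class and method names are hypothetical, and the choice to place each 64-bit floating point register in the leftmost half of the corresponding vector register is an assumption made for illustration; the text above only states that the registers overlap.

```python
VECTOR_COUNT = 32   # vector registers 1002
VECTOR_BYTES = 16   # 128 bits each
FLOAT_COUNT = 16    # floating point registers 1004
FLOAT_BYTES = 8     # 64 bits each


class RegisterFile:
    """Hypothetical model of register file 1000 with overlaid registers."""

    def __init__(self) -> None:
        self._storage = bytearray(VECTOR_COUNT * VECTOR_BYTES)

    @staticmethod
    def _vector_start(number: int) -> int:
        if not 0 <= number < VECTOR_COUNT:
            raise ValueError("vector register number out of range")
        return number * VECTOR_BYTES

    def read_vector(self, number: int) -> bytes:
        start = self._vector_start(number)
        return bytes(self._storage[start:start + VECTOR_BYTES])

    def write_float(self, number: int, value: bytes) -> None:
        if not 0 <= number < FLOAT_COUNT:
            raise ValueError("floating point register number out of range")
        if len(value) != FLOAT_BYTES:
            raise ValueError("floating point value must be 8 bytes")
        # Assumption: the floating point register occupies the leftmost
        # 8 bytes of the same-numbered vector register.
        start = self._vector_start(number)
        self._storage[start:start + FLOAT_BYTES] = value


registers = RegisterFile()
registers.write_float(2, bytes([0xAB] * 8))
print(registers.read_vector(2).hex())
```

Because both views share one backing store, a write through the floating point view is visible through the vector view without any explicit synchronization.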
077. Similarly, PowerPC or another target architecture has a set of registers assigned to it. This set of registers can be different from or the same as the set of registers assigned to the source architecture. The target architecture may have more or fewer registers available for a particular type of instruction. For example, in the case depicted in FIG. 9A, the Load Vector instruction and the Load Vector Indexed instruction each have 32 vector registers available. Other examples are also possible.
078. As indicated by the opcode, the Load Vector instruction includes a non-contiguous specifier which, in this example, is represented in the V1 and RXB fields. These non-contiguous fields are combined to create a contiguous index in the Load Vector Indexed instruction 950. This contiguous specifier is indicated in the VRT field 954 of instruction 950. In this particular example, as shown in the code VL v18, 0(0, gr5), the vector register being specified is register 18. This register is specified in the instruction by the non-contiguous specifier provided by the V1 field and the RXB field. In this example, the V1 field includes a value of 2 (binary 0010) and the RXB field includes a value of 8 (binary 1000). Based on predefined rules, since V1 is the first operand, the leftmost bit (1) of 1000 is concatenated with the bits in field V1 (0010) to produce a contiguous specifier of 10010, which is the value 18 in decimal.
079. As shown at reference numeral 956, a representation of 18 is placed in the VRT field of the Load Vector Indexed instruction, which corresponds to the register field (V1) of the Load Vector instruction. For completeness, the RA and RB fields of instruction 950 correspond, respectively, to X2 and B2 of instruction 900. Field D2 of instruction 900 has no corresponding field in instruction 950, and the opcode fields of instruction 900 correspond to the opcode fields of instruction 950.
080. Another example is depicted in FIG. 9B. In this example, as in the example depicted in FIG. 9A, the non-contiguous specifier (V1, RXB) of instruction 900 is transformed into a contiguous specifier (VRT) of instruction 950. However, in this example, the register allocated for instruction 950 does not have the same number as the transformed contiguous specifier; instead, the contiguous specifier is mapped to a different register. For example, in the case of FIG. 9A, the non-contiguous specifier references register 18, as does the contiguous specifier; that is, there is a one-to-one mapping. However, in FIG. 9B, the non-contiguous specifier of 18 is transformed into a contiguous specifier of 18, but then the 18 of the contiguous specifier is mapped to a different register, such as register 7 (see reference number 890); i.e., register 18 in the source architecture maps to register 7 in the target architecture in this particular example. Such mapping is predefined and accessible to the processor.
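The difference between the two cases can be sketched as a table lookup applied after the contiguous specifier has been formed. The tables below are illustrative assumptions: only the pair 18 to 7 comes from the text, and the reverse entry 7 to 18 is added here solely so that the hypothetical table stays one-to-one.

```python
SOURCE_REGISTERS = range(32)

# FIG. 9A case: each source register keeps its own number in the target.
ONE_TO_ONE = {number: number for number in SOURCE_REGISTERS}

# FIG. 9B case: register 18 is mapped to register 7 (from the text).
# Assumption: 7 is mapped back to 18 so that no two source registers
# share a target register.
REMAPPED = dict(ONE_TO_ONE)
REMAPPED[18] = 7
REMAPPED[7] = 18


def target_register(contiguous_specifier: int, mapping: dict) -> int:
    """Look up the target register allocated for a contiguous specifier."""
    return mapping[contiguous_specifier]


print(target_register(18, ONE_TO_ONE), target_register(18, REMAPPED))
```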
081. Yet another example is depicted in FIG. 11. In this example, instead of allocating to a register during emulation, as in FIGS. 9A and 9B, the allocation is to memory. In this example, a VLR instruction is used to move the contents of one vector register, VR 18, to another vector register, VR 24. However, in this example, it is assumed that the register file is not large enough to include these vector registers, so memory is used instead. There is a contiguous portion of memory that stores a plurality of vectors as an array. The array starts at an address, rvbase, at which the first register, e.g., register 0, is stored; the next register is stored at an offset, e.g., 16 bytes, from rvbase; the third register is stored at the same offset from the second register; and so on. So, in this example, register 18 is at offset 288 from rvbase, and register 24 is at offset 384 from rvbase.
082. In this example, there are two non-contiguous specifiers (V1, RXB; and V2, RXB). Thus, two contiguous specifiers are generated. For example, since V1 is the first operand, the first contiguous specifier is generated by concatenating the bits in V1 with bit 0 of RXB. Since V1 includes 1000 in binary (8 in decimal) and RXB includes 1100 in binary (12 in decimal), the first contiguous specifier is formed by concatenating 1 (from bit 0 of RXB) with 1000 (from V1), giving 11000 (24 in decimal). Similarly, the second contiguous specifier is generated by concatenating 1 (from bit 1 of RXB) with 0010 (2 in decimal, from V2), giving 10010 (18 in decimal). Since these registers are in memory, vector register 24 is at offset 384 from rvbase, and vector register 18 is at offset 288 from rvbase. These values are shown in FIG. 11 at 1102, 1104, respectively.
083. The pseudo code on the right in FIG. 11 and the instructions on the left describe moving a contiguous number of bytes that corresponds to a vector register from a vector offset of 18 (which corresponds to a byte offset of 288) to a vector offset of 24 (which corresponds to a byte offset of 384). In particular, a load immediate (LI) loads a value of 288 into rtemp1, and then a vector load is performed from an address given by rvbase plus the offset in rtemp1, and the value is stored in a temporary vector register, vtemp2. Then, the next load immediate loads 384 into rtemp1, and a store back to memory is performed at a location that corresponds to the address plus the offset of vector register 24 (e.g., offset 384).
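The memory-backed move can be sketched in Python as follows. The names `rvbase`, `rtemp1` and `vtemp2` follow the text; the function names, the use of a `bytearray` to stand in for memory, and the value chosen for `rvbase` are assumptions for illustration.

```python
VECTOR_BYTES = 16  # each vector register occupies 16 bytes in the array


def vector_offset(register: int) -> int:
    """Byte offset of a vector register from rvbase."""
    return register * VECTOR_BYTES


def move_vector(memory: bytearray, rvbase: int, source: int, target: int) -> None:
    """Emulate a register-to-register vector move on memory-resident registers."""
    # Load immediate, then vector load from rvbase + rtemp1 into vtemp2.
    rtemp1 = vector_offset(source)
    vtemp2 = bytes(memory[rvbase + rtemp1:rvbase + rtemp1 + VECTOR_BYTES])
    # Load immediate, then store vtemp2 back to memory at rvbase + rtemp1.
    rtemp1 = vector_offset(target)
    memory[rvbase + rtemp1:rvbase + rtemp1 + VECTOR_BYTES] = vtemp2


rvbase = 64  # hypothetical start address of the register array
memory = bytearray(rvbase + 32 * VECTOR_BYTES)
memory[rvbase + 288:rvbase + 304] = bytes(range(16))  # contents of VR 18
move_vector(memory, rvbase, source=18, target=24)
print(vector_offset(18), vector_offset(24))
```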
084. While several examples are described above, many other examples and variations are possible. Additional information regarding vector instructions and the use of the RXB field is described in a co-filed patent application entitled “Instruction to Load Data Up to A Specified Memory Boundary Indicated by the Instruction,” Serial No...., (IBM Docket No. POU920120030US1), Jonathan D. Bradbury et al., which is incorporated herein by reference in its entirety.
085. In addition, several architectures are mentioned herein. An embodiment of the z/Architecture is described in the IBM® publication entitled “z/Architecture Principles of Operation,” IBM® Publication No. SA22-7832-08, Ninth Edition, August 2010, which is incorporated herein by reference in its entirety. IBM® and Z/ARCHITECTURE® are registered trademarks of International Business Machines Corporation, Armonk, New York, USA. Other names used herein may be registered trademarks, trademarks, or product names of International Business Machines Corporation or other companies. Additionally, an embodiment of the Power Architecture is described in Power ISA™ Version 2.06 Revision B, International Business Machines Corporation, July 23, 2010, which is incorporated herein by reference in its entirety. POWER ARCHITECTURE® is a registered trademark of International Business Machines Corporation. Further, an embodiment of the Intel architecture is described in Intel® 64 and IA-32 Architectures Developer's Manual: Vol. 2A, Instruction Set Reference, A-L, Order Number 253666-041US, December 2011, and Intel® 64 and IA-32 Architectures Developer's Manual: Vol. 2B, Instruction Set Reference, M-Z, Order Number 253667-041US, December 2011, each of which is incorporated herein by reference in its entirety. Intel® is a registered trademark of Intel Corporation, Santa Clara, California.
086. Described in detail herein is a technique for transforming non-contiguous specifiers of an instruction defined for one system architecture into contiguous specifiers for an instruction defined for another system architecture. Previous architectural emulation has not successfully addressed the emulation of systems with non-contiguous specifiers, and particularly non-contiguous register specifiers, in either fixed-width or variable-width instruction sets. However, in accordance with one aspect of the present invention, a technique is provided for extending existing emulators to handle non-contiguous specifiers. The technique includes, for example, reading the non-contiguous specifiers, generating a contiguous index from a non-contiguous specifier, and using the contiguous index to access or represent a homogeneous resource.
087. In a further embodiment, according to a JIT (just-in-time) implementation, the contiguous index is used to make allocation decisions, optionally representing a resource accessed by a non-contiguous specifier by a non-contiguous/non-homogeneous resource, where the partitioning reflects optimization decisions rather than the boundaries of the non-contiguous specifier. That is, in one embodiment, an instruction defined for one architecture has at least one non-contiguous specifier for at least one resource, and that at least one non-contiguous specifier is transformed into at least one contiguous specifier. That at least one contiguous specifier is used to select at least one resource for an instruction of another architecture to use. The instruction of the other architecture, however, uses non-contiguous specifiers. Thus, the at least one contiguous specifier for the at least one selected resource is then transformed into at least one non-contiguous specifier for use by the instruction of the other architecture. In one embodiment, this is performed by an emulator.
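The reverse transformation, from a contiguous specifier back to a non-contiguous one, can be sketched as a split of the five-bit register number. The sketch assumes that the target instruction uses the same layout as the source examples in this description (a four-bit register field plus a four-bit extension field whose leftmost bit belongs to the first operand); a real target architecture may lay out its fields differently. The function name and the zero-based `operand_index` parameter are hypothetical.

```python
def split_specifier(register: int, operand_index: int) -> tuple:
    """Split a five-bit register number into (four-bit field, extension mask).

    operand_index is zero-based: 0 selects the leftmost extension bit.
    The returned mask can be OR-ed with the masks of the other operands.
    """
    if not 0 <= register < 32:
        raise ValueError("register number must fit in five bits")
    if not 0 <= operand_index < 4:
        raise ValueError("only four extension bits are available")
    register_field = register & 0b1111
    extension_bit = register >> 4
    return register_field, extension_bit << (3 - operand_index)


# Registers 24 (first operand) and 18 (second operand).
field_1, mask_1 = split_specifier(24, 0)
field_2, mask_2 = split_specifier(18, 1)
print(bin(field_1), bin(field_2), bin(mask_1 | mask_2))
```

Applied to registers 24 and 18, the split yields the fields 1000 and 0010 and a combined extension value of 1100, which are the values used in the VLR example of this description.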
088. In one embodiment, an emulator is provided for emulating execution of instructions of an instruction set of a first computer architecture on a processor designed for a second computer architecture. The emulator performs, for example: fetching instructions of an application by the emulation program; interpreting the opcodes of the instructions in order to select an emulation module for emulating the instructions; determining from the opcode that the instructions employ non-contiguous register fields; combining the non-contiguous register fields of the instruction to form a combined register field; and using the combined register field by the instructions of the emulation module in order to emulate the instructions.
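The steps listed in this paragraph can be sketched as a small interpreter loop. Everything in the sketch is a simplifying assumption: instructions are modeled as pre-decoded dictionaries rather than binary encodings, the opcode table `EMULATION_MODULES` holds a single hypothetical entry, and the emulated state is a plain list of 32 vector values.

```python
def combine(register_field: int, extension_bit: int) -> int:
    """Form a combined register field from its non-contiguous parts."""
    return (extension_bit << 4) | register_field


def emulate_vlr(state: dict, operands: list) -> None:
    """Emulation module for a vector register-to-register move."""
    target, source = operands
    state["vr"][target] = state["vr"][source]


# Hypothetical table: opcode -> (emulation module, uses non-contiguous fields).
EMULATION_MODULES = {"VLR": (emulate_vlr, True)}


def run(program: list, state: dict) -> None:
    for instruction in program:                      # fetch
        module, non_contiguous = EMULATION_MODULES[instruction["opcode"]]
        if non_contiguous:                           # determined from the opcode
            operands = [
                combine(field, (instruction["rxb"] >> (3 - position)) & 1)
                for position, field in enumerate(instruction["fields"])
            ]
        else:
            operands = list(instruction["fields"])
        module(state, operands)                      # emulate


state = {"vr": [bytes(16)] * 32}
state["vr"][18] = bytes(range(16))
run([{"opcode": "VLR", "fields": [0b1000, 0b0010], "rxb": 0b1100}], state)
print(state["vr"][24] == state["vr"][18])
```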
089. Furthermore, in one embodiment, the register space includes a subsection, and the instruction set of the first computer architecture includes first instructions having register fields for accessing only the subsection, and second instructions having non-contiguous register fields for accessing the entire register space.
090. In one embodiment, the RXB field is at the same location for all instructions that use the RXB field. The RXB bits are bit-significant: bit 36, the first bit of the RXB field, is used to extend bits 8-11 of the instruction; bit 37 is used to extend bits 12-15; bit 38 is used to extend bits 16-19; and bit 39 is used to extend bits 32-35, as examples. Also, the decision to use an RXB bit as an extension bit depends on the opcode (e.g., R1 vs. V1). Furthermore, non-contiguous specifiers can use fields other than RXB fields.
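The bit positions given in this paragraph can be applied to a 48-bit instruction image as sketched below, with bits numbered from 0 on the left. The helper names are hypothetical, the instruction is assumed to be held in a Python integer, and the sample image sets only the register field and the extension bit; its remaining bits, including the opcode, are left as zero and do not represent a real encoding.

```python
INSTRUCTION_BITS = 48  # three halfwords

# Extension bit position -> (first bit, last bit) of the field it extends.
EXTENDED_FIELDS = {36: (8, 11), 37: (12, 15), 38: (16, 19), 39: (32, 35)}


def bits(instruction: int, first: int, last: int) -> int:
    """Extract bits first..last (inclusive), numbered left to right from 0."""
    width = last - first + 1
    shift = INSTRUCTION_BITS - 1 - last
    return (instruction >> shift) & ((1 << width) - 1)


def extended_register(instruction: int, extension_position: int) -> int:
    """Combine a four-bit field with the extension bit that belongs to it."""
    first, last = EXTENDED_FIELDS[extension_position]
    extension_bit = bits(instruction, extension_position, extension_position)
    return (extension_bit << 4) | bits(instruction, first, last)


# Bits 8-11 = 0010 and bit 36 = 1; all other bits are zero placeholders.
image = (0b0010 << (INSTRUCTION_BITS - 1 - 11)) | (1 << (INSTRUCTION_BITS - 1 - 36))
print(extended_register(image, 36))
```

Whether a given extension bit is actually consulted would depend on the opcode, as the paragraph notes; the sketch leaves that decision to the caller.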
091. In this document, memory, main memory, storage, and main storage are used interchangeably unless otherwise noted explicitly or by context.
092. Additional details regarding the vector facility, including example instructions, are provided as part of this Detailed Description further below.
093. As those skilled in the art will understand, one or more aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, one or more aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, one or more aspects of the present invention may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
094. Any combination of one or more computer readable media may be used. The computer readable medium may be a computer readable storage medium. A computer readable storage medium may be, for example, without limitation, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination thereof. More specific examples (a non-exhaustive list) of computer readable storage media include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination thereof. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus or device.
095. Referring now to FIG. 12, in one example, a computer program product 1200 includes, for example, one or more non-transitory computer readable storage media 1202 for storing computer readable program code means or logic 1204 to provide and facilitate one or more aspects of the present invention.
096. A program code embedded in computer readable media may be transmitted using appropriate media, including, without limitation, wireless and wired networks, fiber optic cable, radio frequency, etc., or any combination thereof.
097. Computer program code to perform operations for one or more aspects of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, C++ or the like, and conventional procedural programming languages, such as the C programming language, assembler, or similar programming languages. The program code may run entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (e.g., over the Internet using an Internet Service Provider).
098. In this document, one or more aspects of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, devices (systems) and computer program products in accordance with embodiments of the invention. It is understood that each block of the flowchart illustrations or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams may be implemented by computer program instructions. Such computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing equipment to produce a machine, such that the instructions that are executed by the computer's processor or by other programmable data processing equipment, create means to implement the functions/actions specified in the flowchart and/or block diagram.
099. Such computer program instructions may also be loaded onto a computer, other programmable data processing equipment, or other devices to cause a series of operational steps to be performed on the computer, other programmable data processing equipment or other devices to produce a computer-implemented process, such that the instructions that are executed on the computer or other programmable data processing equipment provide processes for implementing the functions/actions specified in the flowchart and/or block diagram.
0100. The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products in accordance with various embodiments of one or more aspects of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, segment or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending on the functionality involved. It will also be noted that each block of the block diagram and/or flowchart illustrations, and combinations of blocks in the block diagram and/or flowchart illustrations, may be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
0101. In addition to the foregoing, one or more aspects of the present invention may be provided, offered, deployed, managed, serviced, etc. by a service provider that offers management of customer environments. For example, the service provider may create, maintain, support, etc. computer code and/or computing infrastructure that performs one or more aspects of the present invention for one or more customers. In return, the service provider may receive payment from the customer under a subscription and/or fee agreement, as examples. In addition or alternatively, the service provider may receive payment from the sale of advertising content to one or more third parties.
0102. In one aspect of the present invention, an application may be used to carry out one or more aspects of the present invention. By way of example, using an application comprises providing operational computing infrastructure to carry out one or more aspects of the present invention.
0103. As a further aspect of the present invention, a computing infrastructure may be deployed, comprising integrating computer readable code into a computer system, in which the code in combination with the computer system is capable of performing one or more aspects of the present invention.
0104. In yet another aspect of the present invention, a computing infrastructure integration process comprising integrating a computer readable code into a computing system may be provided. The computer system comprises computer readable media, which media comprises one or more aspects of the present invention. The code in combination with the computer system is capable of realizing one or more aspects of the present invention.
0105. Although various embodiments are described above, these are examples only. Computing environments of other architectures may incorporate and use one or more aspects of the present invention. In addition, vectors of other sizes or other registers may be used, and changes may be made to the instructions without departing from the spirit of the present invention. Additionally, other instructions may be used in the processing. Furthermore, one or more aspects of the invention relating to the transformation of non-contiguous specifiers into contiguous specifiers may be used in other contexts, and the specifiers may designate resources other than registers. Other changes are also possible.
0106. In addition, other types of computing environments may benefit from one or more aspects of the present invention. For example, a data processing system suitable for storing and/or executing program code is usable that includes at least two processors coupled directly or indirectly to memory elements through a system bus. The memory elements include, for example, local memory employed during actual execution of the program code, mass storage, and cache memory that provides temporary storage of at least some program code in order to reduce the number of times the code must be retrieved from mass storage during execution.
0107. Input/output or I/O devices (including, without limitation, keyboards, displays, pointing devices, direct access storage devices (DASD), tape, CDs, DVDs, thumb drives and other memory media, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or to remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the available types of network adapters.
0108. Referring to FIG. 13, representative components of a host computer system 5000 for implementing one or more aspects of the present invention are depicted. Host computer 5000 comprises one or more CPUs 5001 in communication with computer memory (i.e., central storage) 5002, as well as I/O interfaces to storage media devices 5011 and networks 5010 for communicating with other computers or storage area networks (SANs) and the like. The CPU 5001 is compliant with an architecture having an architected instruction set and architected functionality. The CPU 5001 may have dynamic address translation (DAT) 5003 for transforming program addresses (virtual addresses) into real memory addresses. A DAT typically includes a translation lookaside buffer (TLB) 5007 for caching translations so that later accesses to the block of computer memory 5002 do not incur the delay of address translation. Typically, a cache 5009 is employed between computer memory 5002 and the processor 5001. The cache 5009 may be hierarchical, having a large cache available to more than one CPU and smaller, faster (lower-level) caches between the large cache and each CPU. In some implementations, the lower-level caches are split to provide separate low-level caches for instruction fetching and data accesses. In one embodiment, an instruction is fetched from memory 5002 by an instruction fetch unit 5004 via a cache 5009. The instruction is decoded in an instruction decode unit 5006 and dispatched (with other instructions in some embodiments) to the instruction execution unit or units 5008. Typically, several execution units 5008 are employed, for example, an arithmetic execution unit, a floating point execution unit, and a branch instruction execution unit. The instruction is executed by the execution unit, accessing operands from instruction-specified registers or memory as needed. If an operand is to be accessed (loaded or stored) from memory 5002, a load/store unit 5005 typically handles the access under control of the instruction being executed. Instructions may be executed in hardware circuits or in internal microcode (firmware) or by a combination of both.
0109. As noted, a computer system includes information in local (or main) storage, as well as addressing, protection, and reference and change recording. Some aspects of addressing include the format of addresses, the concept of address spaces, the various types of addresses, and the manner in which one type of address is translated into another type of address. Some of main storage includes permanently assigned storage locations. Main storage provides the system with directly addressable, fast-access storage of data. Both data and programs are to be loaded into main storage (from input devices) before they can be processed.
0110. Main storage may include one or more smaller, faster-access buffer storages, sometimes called caches. A cache is typically physically associated with a CPU or an I/O processor. The effects, except on performance, of the physical construction and use of distinct storage media are generally not observable by the program.
0111. Separate caches may be maintained for instructions and for data operands. Information within a cache is maintained in contiguous bytes on an integral boundary called a cache block or cache line (or line, for short). A model may provide an EXTRACT CACHE ATTRIBUTE instruction that returns the size of a cache line in bytes. A model may also provide PREFETCH DATA and PREFETCH DATA RELATIVE LONG instructions that effect the prefetching of storage into the data or instruction cache or the releasing of data from the cache.
0112. Storage is visualized as a long, horizontal bit string. For most operations, storage accesses are processed in a left-to-right sequence. The bit string is subdivided into eight-bit units. The eight-bit unit is called a byte, which is the basic building block of all information formats. Each stored byte location is identified by a unique non-negative integer number, which is the address of that byte location, or simply byte address. Adjacent byte locations have consecutive addresses, starting with 0 on the left and continuing in sequence from left to right. Addresses are unsigned binary integers and are 24, 31, or 64 bits long.
0113. Information is transmitted between storage and a CPU or a channel subsystem one byte, or a group of bytes, at a time. Unless otherwise specified, in the z/Architecture, for example, a group of bytes in storage is addressed by the leftmost byte of the group. The number of bytes in the group is either implied or explicitly specified by the operation to be performed. When used in a CPU operation, a group of bytes is called a field. Within each group of bytes, in the z/Architecture, for example, bits are numbered in a left-to-right sequence. In the z/Architecture, the leftmost bits are sometimes referred to as the high-order bits and the rightmost bits as the low-order bits. Bit numbers are not storage addresses, however. Only bytes can be addressed. To operate on individual bits of a byte in storage, the entire byte is accessed. The bits in a byte are numbered 0 through 7, from left to right (in, e.g., the z/Architecture). The bits in an address may be numbered 8-31 or 40-63 for 24-bit addresses, or 1-31 or 33-63 for 31-bit addresses; they are numbered 0-63 for 64-bit addresses. Within any other fixed-length format of multiple bytes, the bits making up the format are consecutively numbered starting from 0. For purposes of error detection, and preferably for correction, one or more check bits may be transmitted with each byte or with a group of bytes. Such check bits are generated automatically by the machine and cannot be directly controlled by the program. Storage capacities are expressed in number of bytes. When the length of a storage-operand field is implied by the operation code of an instruction, the field is said to have a fixed length, which can be one, two, four, eight, or sixteen bytes. Larger fields may be implied for some instructions. When the length of a storage-operand field is not implied but is stated explicitly, the field is said to have a variable length. Variable-length operands can vary in length by increments of one byte (or, with some instructions, in multiples of two bytes or other multiples). When information is placed in storage, the contents of only those byte locations that are included in the designated field are replaced, even though the width of the physical path to storage may be greater than the length of the field being stored.
0114. Certain units of information are to be on an integral boundary in storage. A boundary is called integral for a unit of information when its storage address is a multiple of the length of the unit in bytes. Special names are given to fields of 2, 4, 8, and 16 bytes on an integral boundary. A halfword is a group of two consecutive bytes on a two-byte boundary and is the basic building block of instructions. A word is a group of four consecutive bytes on a four-byte boundary. A doubleword is a group of eight consecutive bytes on an eight-byte boundary. A quadword is a group of 16 consecutive bytes on a 16-byte boundary. When storage addresses designate halfwords, words, doublewords, and quadwords, the binary representation of the address contains one, two, three, or four rightmost zero bits, respectively. Instructions are to be on two-byte integral boundaries. The storage operands of most instructions do not have boundary-alignment requirements.
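The relationship between unit length, integral boundaries, and rightmost zero bits can be checked with a short sketch. The table and function names are hypothetical; the unit lengths are those given in this paragraph.

```python
UNIT_LENGTHS = {"halfword": 2, "word": 4, "doubleword": 8, "quadword": 16}


def is_on_integral_boundary(address: int, unit: str) -> bool:
    """True when the address is a multiple of the unit's length in bytes."""
    return address % UNIT_LENGTHS[unit] == 0


def rightmost_zero_bits(unit: str) -> int:
    """Number of low-order zero bits in any address aligned for the unit."""
    return UNIT_LENGTHS[unit].bit_length() - 1


for unit in UNIT_LENGTHS:
    print(unit, rightmost_zero_bits(unit), is_on_integral_boundary(0x1008, unit))
```

The address 0x1008 used in the loop is a multiple of 8 but not of 16, so it lies on an integral boundary for every unit except the quadword.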
0115. On devices that implement separate caches for instructions and data operands, a significant delay can be experienced if the program stores in a cache line from which instructions are subsequently fetched, regardless of whether the store changes instructions that are subsequently fetched.
0116. In one embodiment, the invention may be practiced by software (sometimes referred to as licensed internal code, firmware, microcode, millicode, picocode, and the like, any of which would be consistent with one or more aspects of the present invention). With reference to FIG. 13, software program code embodying one or more aspects of the present invention is accessible by processor 5001 of host system 5000 from long-term storage media devices 5011, such as a CD-ROM drive, tape drive, or hard drive. Software program code may be embodied on any of a variety of media known for use with a data processing system, such as a floppy disk, hard disk, or CD-ROM. The code may be distributed on such media, or may be distributed to users from computer memory 5002 or storage of one computer system over a network 5010 to other computer systems for use by users of such other systems.
0117. Software program code includes an operating system that controls the function and interaction of the various computer components and one or more application programs. Program code is normally paged from the storage media device 5011 to the relatively higher-speed computer storage 5002, where it is available for processing by the processor 5001. Techniques and methods for embodying software program code in memory, on physical media, and/or distributing software code via networks are well known and will not be discussed further here. Program code, when created and stored on tangible media (including, without limitation, electronic memory modules (RAM), flash memory, CDs, DVDs, magnetic tape and the like), is often referred to as a “computer program product”. The computer program product medium is typically readable by a processing circuit, preferably in a computer system, for execution by the processing circuit.
0118. FIG. 14 illustrates a representative workstation or server hardware system in which one or more aspects of the present invention may be practiced. The system 5020 of FIG. 14 comprises a representative base computer system 5021, such as a personal computer, a workstation or a server, including optional peripheral devices. The base computer system 5021 includes one or more processors 5026 and a bus employed to connect and enable communication between the processor(s) 5026 and the other components of the system 5021 in accordance with known techniques. The bus connects the processor 5026 to memory 5025 and long-term storage 5027, which can include a hard drive (including any of magnetic media, CD, DVD and flash memory, for example) or a tape drive, for example. The system 5021 may also include a user interface adapter, which connects the microprocessor 5026 via the bus to one or more interface devices, such as a keyboard 5024, a mouse 5023, a printer/scanner 5030, and/or other interface devices, which can be any user interface device such as a touch-sensitive screen, digitized entry pad, etc. The bus also connects a display device 5022, such as an LCD screen or monitor, to the microprocessor 5026 via a display adapter.
0119. System 5021 can communicate with other computers or networks of computers by way of a network adapter capable of communicating 5028 with a network 5029. Example network adapters are communications channels, token ring, Ethernet, or modems. Alternatively, the system 5021 can communicate using a wireless interface, such as a CDPD (Cellular Digital Packet Data) card. System 5021 may be associated with other computers in a local area network (LAN) or a wide area network (WAN), or system 5021 may be a client in a client/server arrangement with another computer, etc. All of these configurations, as well as the appropriate communications hardware and software, are known in the art.
0120. FIG. 15 illustrates a data processing network 5040 in which one or more aspects of the present invention may be practiced. The data processing network 5040 may include a plurality of individual networks, such as a wireless network and a wired network, each of which may include a plurality of individual workstations 5041, 5042, 5043, 5044. Additionally, as those skilled in the art will appreciate, one or more LANs may be included, where a LAN may comprise a plurality of intelligent workstations coupled to a host processor.
0121. Still referring to FIG. 15, the network may also include mainframe computers or servers, such as a gateway (intermediate) computer (client server 5046) or application server (remote server 5048, which can access a data repository and can also be accessed directly from a workstation 5045). The 5046 gateway computer serves as the entry point into each individual network. A gateway is required when connecting one network protocol to another. Gateway 5046 can preferably be coupled to another network (internet 5047, for example) via a communications link. Gateway 5046 may also be directly coupled to one or more workstations 5041, 5042, 5043, 5044 using a communications link. The gateway computer can be deployed using an IBM eServer™ System z server from International Business Machines Corporation.
0122. Referring concurrently to FIGs. 14 and 15, software programming code that may embody one or more aspects of the present invention may be accessed by the processor 5026 of the system 5020 from long-term storage media 5027, such as a CD-ROM drive or hard drive. The software programming code may be embodied on any of a variety of known media for use with a data processing system, such as a diskette, hard drive, or CD-ROM. The code may be distributed on such media, or may be distributed to users 5050, 5051 from the memory or storage of one computer system over a network to other computer systems for use by users of such other systems.
0123. Alternatively, the program code may be embedded in memory 5025 and accessed by a processor 5026 using the processor bus. Such programming code includes an operating system that controls the function and interaction of the various computer components and one or more application programs 5032. The program code is normally communicated from the storage media device 5027 to higher speed memory 5025 where it is available for processing by the 5026 processor. Techniques and methods for embedding software programming code in memory, on physical media, and/or distributing software code over networks are well known and will not be discussed further here. Program code, when created and stored on tangible media (including, without limitation, electronic memory modules (RAM), flash memory, CDs, DVDs, Magnetic Tape and the like) is often referred to as a “computer program product”. The computer program product media is generally readable by a processor circuit preferably in a computer system for execution by the processor circuit.
0124. The cache that is most readily available to the processor (normally faster and smaller than the other caches of the processor) is the lowest-level cache (L1, or level one), and main store (main memory) is the highest-level cache (L3, if there are 3 levels). The lowest-level cache is often divided into an instruction cache (I-Cache) holding the machine instructions to be executed and a data cache (D-Cache) holding the data operands.
0125. Referring to FIG. 16, an exemplary processor embodiment is depicted for processor 5026. Typically, one or more levels of cache 5053 are employed to buffer memory blocks in order to improve processor performance. The cache 5053 is a high-speed buffer holding cache lines of memory data that are likely to be used. Typical cache lines are 64, 128 or 256 bytes of memory data. Separate caches are often employed for caching instructions and for caching data. Cache coherence (synchronization of copies of lines in memory and the caches) is often provided by various snoop algorithms well known in the art. Main memory storage 5025 of a processor system is often referred to as a cache. In a processor system having 4 levels of cache 5053, main storage 5025 is sometimes referred to as the level 5 (L5) cache, since it is typically faster and holds only a portion of the non-volatile storage (DASD, tape, etc.) that is available to a computer system. Main storage 5025 “caches” pages of data paged in and out of the main storage 5025 by the operating system.
0126. A program counter (instruction counter) 5061 keeps track of the address of the current instruction to be executed. A program counter in a z/Architecture processor is 64 bits and can be truncated to 31 or 24 bits to support prior addressing limits. A program counter is typically embodied in a PSW (program status word) of a computer such that it persists during context switching. Thus, a program in progress, having a program counter value, may be interrupted by, for example, the operating system (a context switch from the program environment to the operating system environment). The PSW of the program maintains the program counter value while the program is not active, and the program counter (in the PSW) of the operating system is used while the operating system is executing. Typically, the program counter is incremented by an amount equal to the number of bytes of the current instruction. RISC (Reduced Instruction Set Computing) instructions are typically fixed length, while CISC (Complex Instruction Set Computing) instructions are typically variable length. Instructions of the IBM z/Architecture are CISC instructions having a length of 2, 4 or 6 bytes. The program counter 5061 is modified by either a context switch operation or a branch taken operation of a branch instruction, for example. In a context switch operation, the current program counter value is saved in the PSW along with other state information about the program being executed (such as condition codes), and a new program counter value is loaded pointing to an instruction of a new program module to be executed. A branch taken operation is performed in order to permit the program to make decisions or loop within the program by loading the result of the branch instruction into the program counter 5061.
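The sequential update of the program counter described above can be sketched as follows. The text states only that instructions are 2, 4 or 6 bytes long and that the counter may be truncated to 31 or 24 bits; the rule that the two leftmost opcode bits select the length is taken from general z/Architecture knowledge and is an assumption here, as are the function names.

```python
def instruction_length(first_byte: int) -> int:
    """Length in bytes, chosen by the two leftmost bits of the first opcode byte.

    Assumption (not stated in the text): 00 -> 2 bytes, 01 or 10 -> 4 bytes,
    11 -> 6 bytes.
    """
    top_two_bits = (first_byte >> 6) & 0b11
    if top_two_bits == 0b00:
        return 2
    if top_two_bits == 0b11:
        return 6
    return 4


def next_sequential_address(pc: int, first_byte: int, address_bits: int = 64) -> int:
    """Advance the program counter by the instruction length, wrapping to the
    active addressing mode (64, 31 or 24 bits)."""
    mask = (1 << address_bits) - 1
    return (pc + instruction_length(first_byte)) & mask
```

A branch taken operation or a context switch would bypass this function and load the program counter directly.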
0127. Typically, an instruction fetch unit 5055 is employed to fetch instructions on behalf of the processor 5026. The fetch unit fetches either the “next sequential instructions”, the target instructions of branch taken instructions, or the first instructions of a program following a context switch. Modern instruction fetch units often employ prefetch techniques to speculatively prefetch instructions based on the likelihood that the prefetched instructions might be used. For example, a fetch unit may fetch 16 bytes of instruction that include the next sequential instruction and additional bytes of further sequential instructions.
0128. The fetched instructions are then executed by the processor 5026. In one embodiment, the fetched instruction(s) are passed to a dispatch unit 5056 of the fetch unit. The dispatch unit decodes the instruction(s) and forwards information about the decoded instruction(s) to the appropriate units 5057, 5058, 5060. An execution unit 5057 will typically receive information about decoded arithmetic instructions from the instruction fetch unit 5055 and will perform arithmetic operations on operands according to the opcode of the instruction. Operands are provided to the execution unit 5057 preferably from memory 5025, from architected registers 5059, or from an immediate field of the instruction being executed. Results of the execution, when stored, are stored either in memory 5025, in registers 5059, or in other machine hardware (such as control registers, PSW registers and the like).
0129. A processor 5026 typically has one or more units 5057, 5058, 5060 for executing the function of the instruction. Referring to FIG. 17A, an execution unit 5057 may communicate with architected general registers 5059, a decode/dispatch unit 5056, a load/store unit 5060, and other processor units 5065 by way of interfacing logic 5071. An execution unit 5057 may employ several register circuits 5067, 5068, 5069 to hold information that the arithmetic logic unit (ALU) 5066 will operate on. The ALU performs arithmetic operations such as add, subtract, multiply and divide, as well as logical functions such as AND, OR and exclusive-OR (XOR), rotate and shift. Preferably, the ALU supports specialized operations that are design dependent. Other circuits may provide other architected facilities 5072, including condition codes and recovery support logic, for example. Typically, the result of an ALU operation is held in an output register circuit 5070, which can forward the result to a variety of other processing functions. There are many arrangements of processor units; the present description is only intended to provide a representative understanding of one embodiment.
0130. An ADD instruction, for example, would be executed in an execution unit 5057 having arithmetic and logical functionality, while a floating-point instruction, for example, would be executed in a floating-point execution unit having specialized floating-point capability. Preferably, an execution unit operates on operands identified by an instruction by performing an opcode-defined function on the operands. For example, an ADD instruction may be executed by an execution unit 5057 on operands found in two registers 5059 identified by register fields of the instruction.
0131. The execution unit 5057 performs the arithmetic addition on two operands and stores the result in a third operand, where the third operand may be a third register or one of the two source registers. The execution unit preferably utilizes an arithmetic logic unit (ALU) 5066 that is capable of performing a variety of logical functions, such as Shift, Rotate, AND, OR and XOR, as well as a variety of algebraic functions, including add, subtract, multiply and divide. Some ALUs 5066 are designed for scalar operations and some for floating point. Data may be Big Endian (where the least significant byte is at the highest byte address) or Little Endian (where the least significant byte is at the lowest byte address), depending on the architecture. The IBM z/Architecture is Big Endian. Signed fields may be sign and magnitude, 1's complement or 2's complement, depending on the architecture. A 2's complement number is advantageous in that the ALU does not need to be designed with a subtract capability, since either a negative value or a positive value in 2's complement requires only an addition within the ALU. Numbers are commonly described in shorthand, where a 12-bit field defines an address of a 4,096-byte block and is commonly described as a 4 Kbyte (kilobyte) block, for example.
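The 2's complement and byte-ordering remarks can be illustrated with a short sketch. The 32-bit width and the helper names are assumptions chosen for illustration; the point is only that subtraction reduces to negation followed by addition, and that the two byte orders place the least significant byte at opposite ends.

```python
def subtract_via_add(a: int, b: int, width: int = 32) -> int:
    """Compute a - b using only complement and addition, as a 2's complement
    ALU can. The result is the raw bit pattern, masked to `width` bits."""
    mask = (1 << width) - 1
    negated_b = (~b + 1) & mask        # 2's complement of b
    return (a + negated_b) & mask


def to_signed(value: int, width: int = 32) -> int:
    """Interpret a `width`-bit pattern as a signed 2's complement number."""
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit


# Byte order: the least significant byte (0x78) sits at the highest address
# in Big Endian and at the lowest address in Little Endian.
big_endian = (0x12345678).to_bytes(4, "big")
little_endian = (0x12345678).to_bytes(4, "little")

# A 12-bit field addresses 2**12 = 4,096 bytes, i.e. a 4 Kbyte block.
block_size = 1 << 12
```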
0132. Referring to FIG. 17B, branch instruction information for executing a branch instruction is typically sent to a branch unit 5058, which often employs a branch prediction algorithm, such as a branch history table 5082, to predict the outcome of the branch before other conditional operations are complete. The target of the current branch instruction will be fetched and speculatively executed before the conditional operations are complete. When the conditional operations are completed, the speculatively executed branch instructions are either completed or discarded based on the conditions of the conditional operation and the speculated outcome. A typical branch instruction may test condition codes and branch to a target address if the condition codes meet the branch requirement of the branch instruction; a target address may be calculated based on several numbers, including ones found in register fields or in an immediate field of the instruction, for example. The branch unit 5058 may employ an ALU 5074 having a plurality of input register circuits 5075, 5076, 5077 and an output register circuit 5080. The branch unit 5058 may communicate with general registers 5059, the decode/dispatch unit 5056 or other circuits 5073, for example.
0133. The execution of a group of instructions can be interrupted for a variety of reasons, including a context switch initiated by an operating system, a program exception or error causing a context switch, an I/O interruption signal causing a context switch, or multithreading activity of a plurality of programs (in a multithreaded environment), for example. Preferably, a context switch action saves state information about a currently executing program and then loads state information about another program being invoked. State information may be saved in hardware registers or in memory, for example. State information preferably comprises a program counter value pointing to the next instruction to be executed, condition codes, memory translation information and architected register content. A context switch activity can be exercised by hardware circuits, application programs, operating system programs or firmware code (microcode, pico-code or licensed internal code (LIC)), alone or in combination.
0134. A processor accesses operands according to instruction-defined methods. The instruction may provide an immediate operand using the value of a portion of the instruction, or it may provide one or more register fields explicitly pointing to either general-purpose registers or special-purpose registers (floating-point registers, for example). The instruction may utilize implied registers identified by an opcode field as operands. The instruction may utilize memory locations for operands. A memory location of an operand may be provided by a register, an immediate field, or a combination of registers and an immediate field, as exemplified by the z/Architecture long displacement facility, wherein the instruction defines a base register, an index register and an immediate field (displacement field) that are added together to provide the address of the operand in memory, for example. Location herein typically implies a location in main memory (main storage) unless otherwise indicated.
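The base-plus-index-plus-displacement address formation mentioned above can be sketched as follows. Two details are assumptions drawn from general z/Architecture knowledge rather than from this text: a register number of zero in the base or index field contributes zero instead of the register's content, and the long displacement is a 20-bit signed value. The function names are hypothetical.

```python
def sign_extend_20(raw: int) -> int:
    """Interpret a 20-bit field as a signed displacement (assumed width)."""
    raw &= (1 << 20) - 1
    return raw - (1 << 20) if raw & (1 << 19) else raw


def effective_address(regs, base: int, index: int, displacement: int,
                      address_bits: int = 64) -> int:
    """Add base register, index register and displacement to form the
    operand address, wrapping to the active addressing mode."""
    base_value = regs[base] if base != 0 else 0      # assumed: field 0 = no register
    index_value = regs[index] if index != 0 else 0   # assumed: field 0 = no register
    mask = (1 << address_bits) - 1
    return (base_value + index_value + displacement) & mask
```

For example, with register 1 holding 0x1000, register 2 holding 0x20 and a raw displacement field of 0xFFFFF (that is, -1), the operand address is 0x101F.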
0135. Referring to FIG. 17C, a processor accesses storage using a load/store unit 5060. The load/store unit 5060 may perform a load operation by obtaining the address of the target operand in memory 5053 and loading the operand into a register 5059 or another memory 5053 location, or may perform a store operation by obtaining the address of the target operand in memory 5053 and storing data obtained from a register 5059 or another memory 5053 location in the target operand location in memory 5053. The load/store unit 5060 may be speculative and may access memory in a sequence that is out of order relative to the instruction sequence; however, the load/store unit 5060 must maintain the appearance to programs that instructions were executed in order. The load/store unit 5060 may communicate with general registers 5059, the decode/dispatch unit 5056, the cache/memory interface 5053 or other elements 5083, and comprises various register circuits, ALUs 5085 and control logic 5090 to calculate storage addresses and to provide pipeline sequencing to keep operations in order. Some operations may be out of order, but the load/store unit provides functionality to make the out-of-order operations appear to the program as having been performed in order, as is well known in the art.
0136. The addresses that an application program “sees” are often referred to as virtual addresses. Virtual addresses are sometimes also referred to as “logical addresses” and “effective addresses”. These virtual addresses are virtual in that they are redirected to a physical memory location by one of a variety of dynamic address translation (DAT) technologies including, but not limited to, simply prefixing a virtual address with an offset value, or translating the virtual address via one or more translation tables, the translation tables preferably comprising at least a segment table and a page table, alone or in combination, with the segment table preferably having an entry pointing to the page table. In the z/Architecture, a hierarchy of translation is provided including a region first table, a region second table, a region third table, a segment table and an optional page table. The performance of the address translation is often improved by utilizing a translation lookaside buffer (TLB), which comprises entries mapping a virtual address to an associated physical memory location. The entries are created when the DAT translates a virtual address using the translation tables. Subsequent use of the virtual address can then utilize the entry of the fast TLB rather than the slow sequential translation table accesses. TLB content may be managed by a variety of replacement algorithms, including LRU (Least Recently Used).
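The interplay between table-based translation and a TLB with LRU replacement can be sketched as below. This is a minimal model under stated assumptions: a single flat page table stands in for the multi-level hierarchy, the page size is taken to be 4,096 bytes, and the class and attribute names are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size for the sketch


class Tlb:
    """Small translation lookaside buffer with least-recently-used eviction."""

    def __init__(self, capacity: int, page_table: dict):
        self.capacity = capacity
        self.page_table = page_table      # virtual page number -> physical frame number
        self.entries = OrderedDict()      # ordered from least to most recently used
        self.misses = 0

    def translate(self, virtual_address: int) -> int:
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.entries:
            self.entries.move_to_end(page)            # hit: mark as most recently used
        else:
            self.misses += 1
            frame = self.page_table[page]             # slow path: walk the table
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)      # evict the least recently used entry
            self.entries[page] = frame
        return self.entries[page] * PAGE_SIZE + offset
```

A missing page raises `KeyError` in this sketch, standing in for a translation exception.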
0137. In the case where the processor is a processor of a multiprocessor system, each processor has the responsibility to keep shared resources, such as I/O, caches, TLBs and memory, interlocked for coherency. Typically, “snoop” technologies will be utilized in maintaining cache coherency. In a snoop environment, each cache line may be marked as being in any one of a shared state, an exclusive state, a changed state, an invalid state and the like in order to facilitate sharing.
0138. The I/O units 5054 (FIG. 16) provide the processor with means for attaching to peripheral devices, including tape, disk, printers, displays and networks, for example. I/O units are often presented to the computer program by software drivers. In mainframes, such as the System z from IBM®, channel adapters and open system adapters are I/O units of the mainframe that provide the communications between the operating system and peripheral devices.
0139. In addition, other types of computing environments may benefit from one or more aspects of the present invention. As an example, an environment may include an emulator (e.g., software or other emulation mechanisms), in which a particular architecture (including, for example, instruction execution, architected functions, such as address translation, and architected registers) or a subset thereof is emulated (e.g., on a native computer system having a processor and memory). In such an environment, one or more emulation functions of the emulator can implement one or more aspects of the present invention, even though the computer executing the emulator may have a different architecture than the capabilities being emulated. As one example, in emulation mode, the specific instruction or operation being emulated is decoded, and an appropriate emulation function is built to implement the individual instruction or operation.
0140. In an emulation environment, a host computer includes, for example, memory for storing instructions and data; instruction fetch unit to fetch instructions from memory and optionally provide local intermediate storage for the fetched instruction; instruction decoding unit to receive the fetched instructions and determine the type of instructions that were fetched; and instruction execution unit to execute the instructions. Execution may include loading data into a register from memory; storing data back into memory from a register; or performing some sort of arithmetic or logical operation as determined by the decoding unit. In one example, each unit is implemented in software. For example, the operations being performed by the units are implemented as one or more subroutines within the emulator software.
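The arrangement in which each unit is a software subroutine can be sketched as a small interpreter loop. The three-instruction guest set, the tuple encoding and all names below are hypothetical and exist only to show the fetch, decode and execute subroutines cooperating; they do not correspond to any real architecture.

```python
def fetch(program, pc):
    """Instruction fetch unit: return the guest instruction at the emulated PC."""
    return program[pc]


def decode(instruction):
    """Instruction decode unit: determine the type of instruction and select
    the subroutine that implements it."""
    opcode, *operands = instruction
    return HANDLERS[opcode], operands


def execute_load(state, reg, addr):
    state["regs"][reg] = state["data"][addr]       # memory -> register


def execute_store(state, reg, addr):
    state["data"][addr] = state["regs"][reg]       # register -> memory


def execute_add(state, dst, src):
    state["regs"][dst] += state["regs"][src]       # arithmetic operation


HANDLERS = {"LOAD": execute_load, "STORE": execute_store, "ADD": execute_add}


def run(program, data):
    """Emulator main loop: fetch, decode, execute, advance the emulated PC."""
    state = {"regs": [0] * 4, "data": list(data), "pc": 0}
    while state["pc"] < len(program):
        handler, operands = decode(fetch(program, state["pc"]))
        handler(state, *operands)
        state["pc"] += 1
    return state
```

Running the program `[("LOAD", 0, 0), ("LOAD", 1, 1), ("ADD", 0, 1), ("STORE", 0, 2)]` on data `[2, 3, 0]` leaves the sum in the third data slot.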
0141. More particularly, in a mainframe, architected machine instructions are used by programmers, usually “C” programmers today, often by way of a compiler application. These instructions, stored in the storage medium, may be executed natively in an IBM® z/Architecture server, or alternatively in machines executing other architectures. They can be emulated in existing and future IBM® mainframe servers and on other IBM® machines (e.g., Power Systems servers and System x® servers). They can be executed in machines running Linux on a wide variety of machines using hardware manufactured by IBM®, Intel®, AMD™ and others. Besides execution on z/Architecture hardware, Linux can be used, as can machines that use emulation by Hercules, UMX or FSI (Fundamental Software, Inc.), where execution is generally in an emulation mode. In emulation mode, emulation software is executed by a native processor to emulate the architecture of an emulated processor.
0142. The native processor typically executes emulation software comprising either firmware or a native operating system to perform emulation of the emulated processor. The emulation software is responsible for fetching and executing instructions of the emulated processor architecture. The emulation software maintains an emulated program counter to keep track of instruction boundaries. The emulation software may fetch one or more emulated machine instructions at a time and convert them to a corresponding group of native machine instructions for execution by the native processor. These converted instructions may be cached so that a faster conversion can be accomplished. Notwithstanding, the emulation software must maintain the architecture rules of the emulated processor architecture so as to assure that operating systems and applications written for the emulated processor operate correctly. Furthermore, the emulation software must provide the resources identified by the emulated processor architecture including, but not limited to, control registers, general purpose registers, floating point registers, dynamic address translation function including segment tables and page tables, for example, interrupt mechanisms, context switch mechanisms, time of day (TOD) clocks and architected interfaces to I/O subsystems, such that an operating system or an application program designed to run on the emulated processor can be run on the native processor having the emulation software.
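One way to picture the caching of converted instructions is a lookup table keyed by the emulated program counter and the guest instruction bytes, so that a block already converted is not converted again. The sketch below is an illustration under assumptions, not the mechanism of any particular emulator; `TranslationCache` and `toy_translate` are hypothetical names, and the "native instructions" are just strings.

```python
class TranslationCache:
    """Cache of converted instruction groups, keyed by guest PC and guest bytes."""

    def __init__(self, translate):
        self._translate = translate    # function: guest bytes -> native instruction list
        self._cache = {}
        self.translations = 0          # number of real conversions performed

    def lookup(self, guest_pc: int, guest_bytes: bytes):
        key = (guest_pc, guest_bytes)
        if key not in self._cache:
            self._cache[key] = self._translate(guest_bytes)
            self.translations += 1
        return self._cache[key]


def toy_translate(guest_bytes: bytes):
    """Hypothetical converter: one placeholder native operation per guest byte."""
    return ["native_op_%02x" % b for b in guest_bytes]
```

Including the guest bytes in the key is a design choice for the sketch: if the guest program modifies its own code, the changed bytes miss the cache and are converted afresh, which helps preserve the architecture rules of the emulated processor.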
0143. A specific instruction being emulated is decoded, and a subroutine is called to perform the function of the individual instruction. An emulation software function emulating a function of an emulated processor is implemented, for example, in a “C” subroutine or driver, or by some other method of providing a driver for the specific hardware, as will be within the skill of those in the art after understanding the description of the preferred embodiment. Various software and hardware emulation patents including, but not limited to, U.S. Patent No. 5,551,013 entitled “Multiprocessor for Hardware Emulation” by Beausoleil et al.; U.S. Patent No. 6,009,261 entitled “Preprocessing of Stored Target Routines for Emulating Incompatible Instructions on a Target Processor” by Scalzi et al.; U.S. Patent No. 5,574,873 entitled “Decoding Guest Instruction to Directly Access Emulation Routines that Emulate the Guest Instructions” by Davidian et al.; U.S. Patent No. 6,308,255 entitled “Symmetrical Multiprocessing Bus and Chipset Used for Coprocessor Support Allowing Non-Native Code to Run in a System” by Gorishek et al.; U.S. Patent No. 6,463,582 entitled “Dynamic Optimizing Object Code Translator for Architecture Emulation and Dynamic Optimizing Object Code Translation Method” by Lethin et al.; and U.S. Patent No. 5,790,828 entitled “Method for Emulating Guest Instructions on a Host Computer Through Dynamic Recompilation of Host Instructions” by Eric Traut, each of which is incorporated herein by reference in its entirety; and many others, illustrate a variety of known ways to achieve emulation of an instruction format architected for a different machine on a target machine available to those skilled in the art.
0144. In FIG. 18, an example of an emulated host computer system 5092 is provided that emulates a host computer system 5000' of a host architecture. In the emulated host computer system 5092, the host processor (CPU) 5091 is an emulated host processor (or virtual host processor) and comprises an emulation processor 5093 having a native instruction set architecture different from that of the processor 5091 of the host computer 5000'. The emulated host computer system 5092 has memory 5094 accessible to the emulation processor 5093. In the example embodiment, the memory 5094 is partitioned into a host computer memory 5096 portion and an emulation routines 5097 portion. The host computer memory 5096 is available to programs of the emulated host computer 5092 according to the host computer architecture. The emulation processor 5093 executes native instructions of an architected instruction set of an architecture other than that of the emulated processor 5091, the native instructions being obtained from the emulation routines memory 5097, and may access a host instruction for execution from a program in host computer memory 5096 by employing one or more instructions obtained in a sequence and access/decode routine, which may decode the host instruction(s) accessed to determine a native instruction execution routine for emulating the function of the host instruction accessed. Other facilities that are defined for the architecture of the host computer system 5000' may be emulated by architected facilities routines, including such facilities as general purpose registers, control registers, dynamic address translation, I/O subsystem support and processor cache, for example. The emulation routines may also take advantage of functions available in the emulation processor 5093 (such as general registers and dynamic translation of virtual addresses) to improve the performance of the emulation routines. Special hardware and off-load engines may also be provided to assist the processor 5093 in emulating the function of the host computer 5000'.
0145. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising”, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
0146. The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below, if any, are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of one or more aspects of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Chapter 23. Vector String Instructions
0147. Unless otherwise specified, all operands are vector register operands. The “V” in assembler syntax designates a vector operand.
VECTOR FIND ANY EQUAL

0148. Proceeding from left to right, every unsigned binary integer element of the second operand is compared for equality with each unsigned binary integer element of the third operand and, optionally, with zero if the Zero Search flag is set in the M5 field.
0149. If the Result Type (RT) flag in the M5 field is zero, then for each element in the second operand that matches any element in the third operand, or optionally zero, the bit positions of the corresponding element in the first operand are set to ones; otherwise they are set to zeros. If the Result Type (RT) flag in the M5 field is one, then the byte index of the leftmost element in the second operand that matches an element in the third operand, or zero, is stored in byte seven of the first operand.
0150. Each instruction has an Extended Mnemonic section that describes the recommended extended mnemonics and their corresponding machine assembler syntax.
0151. Programming Note: For all instructions that optionally set the condition code, performance may degrade if the condition code is recorded.
0152. If the Result Type (RT) flag in the M5 field is one and no bytes are found to be equal, or equal to zero if the Zero Search flag is set, an index equal to the number of bytes in the vector is stored in byte seven of the first operand.
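The behavior described for VECTOR FIND ANY EQUAL can be modeled in a few lines. This is a behavioral sketch only, under assumptions: the vector is taken to be 16 bytes wide (the text says only "the number of bytes in the vector"), the element size is passed directly in bytes, elements are read as big-endian unsigned integers, and the condition code is not modeled. The function names are hypothetical.

```python
VECTOR_BYTES = 16  # assumed vector register width in bytes


def split_elements(vector, element_size):
    """Split a vector into unsigned big-endian integer elements."""
    return [int.from_bytes(vector[i:i + element_size], "big")
            for i in range(0, len(vector), element_size)]


def vector_find_any_equal(op2, op3, element_size=1, rt=False, zs=False):
    """Return the first operand as bytes.

    rt=False: every matching element of op2 becomes all ones, others all zeros.
    rt=True:  byte seven holds the byte index of the leftmost match, or
              VECTOR_BYTES when nothing matches; all other bytes are zero.
    zs=True:  a zero element in op2 also counts as a match.
    """
    candidates = set(split_elements(op3, element_size))
    if zs:
        candidates.add(0)
    hits = [e in candidates for e in split_elements(op2, element_size)]
    if not rt:
        return b"".join((b"\xff" if hit else b"\x00") * element_size for hit in hits)
    index = next((i * element_size for i, hit in enumerate(hits) if hit), VECTOR_BYTES)
    result = bytearray(VECTOR_BYTES)
    result[7] = index
    return bytes(result)
```

For the 16-byte string `b"hello world"` padded with zeros and a third operand filled with `b"o"`, the index form reports byte 4; with a third operand that matches nothing, it reports 16, or 11 (the first padding zero) when Zero Search is set.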
0153. The M4 field specifies the element size (ES) control. The ES control specifies the size of the elements in the vector register operands. If a reserved value is specified, a specification exception is recognized.
0 - Byte
1 - Halfword
2 - Word
3-15 - Reserved
0154. The M5 field has the following format:

0155. The bits of the M5 field are defined as follows:
• Result Type (RT): If zero, each resulting element is a mask of all comparisons on that element. If one, a byte index is stored in byte seven of the first operand and zeros are stored in all other elements.
• Zero Search (ZS): If one, each element of the second operand is also compared to zero.
• Condition Code Set (CC): If zero, the condition code is not set and remains unchanged. If one, the condition code is set as specified in the following section.
Special Conditions
0156. A specification exception is recognized and no other action is taken if any of the following occurs:
1. The M4 field contains a value from 3-15.
2. Bit 0 of the M5 field is not zero.
Resulting Condition Code:
0157. If the CC flag is zero, the code remains unchanged.
0158. If the CC flag is one, the code is set as follows:
0 If the ZS bit is set, there were no matches in an element with a lower index than the zero element in the second operand.
1 Some elements of the second operand match at least one element in the third operand.
2 All elements of the second operand match at least one element in the third operand.
3 No elements in the second operand match any elements in the third operand.
Program Exceptions:
0159. • Data with DXC FE, Vector Register
• Operation if the vector-extension facility is not installed
• Specification (reserved ES value)
• Transaction Constraint
Extended Mnemonics:
VECTOR FIND ELEMENT EQUAL

0160. Proceeding from left to right, the unsigned binary integer elements of the second operand are compared with the corresponding unsigned binary integer elements of the third operand. If two elements are equal, the byte index of the first byte of the leftmost equal element is placed in byte seven of the first operand. Zeros are stored in the remaining bytes of the first operand. If no bytes are found to be equal, or equal to zero if the zero compare is set, then an index equal to the number of bytes in the vector is stored in byte seven of the first operand. Zeros are stored in the remaining bytes.
0161. If the Zero Search (ZS) bit is set in the M5 field, then each element in the second operand is also compared for equality with zero. If a zero element is found in the second operand before any other elements of the second and third operands are found to be equal, the byte index of the first byte of the element found to be zero is stored in byte seven of the first operand, and zeros are stored in all other byte locations. If the Condition Code Set (CC) flag is one, then the condition code is set to zero.
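A behavioral sketch of VECTOR FIND ELEMENT EQUAL follows. It assumes a 16-byte vector, an element size given directly in bytes, and a pairwise comparison of elements at the same position in the two operands; the condition code is not modeled, and the function name is hypothetical.

```python
VECTOR_BYTES = 16  # assumed vector register width in bytes


def vector_find_element_equal(op2, op3, element_size=1, zs=False):
    """Return the first operand: byte seven holds the byte index of the leftmost
    element that is equal in both operands (or zero in op2 when zs is set),
    or VECTOR_BYTES if there is none. All other bytes are zero."""
    index = VECTOR_BYTES
    for offset in range(0, VECTOR_BYTES, element_size):
        a = op2[offset:offset + element_size]
        b = op3[offset:offset + element_size]
        if a == b or (zs and not any(a)):
            index = offset
            break
    result = bytearray(VECTOR_BYTES)
    result[7] = index
    return bytes(result)
```

Note that the stored value is always a byte index: with halfword elements, a match in the element at element index 2 yields 4.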
0162. The M4 field specifies the element size (ES) control. The ES control specifies the size of the elements in the vector register operands. If a reserved value is specified, a specification exception is recognized.
0 - Byte
1 - Halfword
2 - Word
3-15 - Reserved
0163. Field M5 has the following format:

0164. The M5 field bits are defined as follows:• Reserved: Bits 0-1 and must be zero. Otherwise, a specification exception is recognized.• Zero Search (ZS): If one, each element of the second operand is also compared to zero.• Condition Code Register (CC): If zero, the condition code remains unchanged . If one, the condition code is logged as specified in the next section. Special conditions
0165. A specification exception is recognized and no further action is taken if any of the following occurs:1. The M4 field contains a value of 3-15.2. Bits 0-1 of the M5 field are not zero. Resulting Condition Code:
0166. If bit 3 of the M5 field is set to one, the code is set as follows:
0 If the zero compare bit is set, the comparison detected a zero element in the second operand in an element with a lower index than any equal comparison.
1 The comparison detected a match between the second and third operands in some element. If the zero compare bit is set, this match occurred in an element with an index less than or equal to that of the zero-comparing element.
2 --
3 No elements compared equal.
0167. If bit 3 of the M5 field is zero, the code remains unchanged.
Program Exceptions:
• Data with DXC FE, Vector Register
• Operation if vector-extension feature is not installed
• Specification (Reserved ES value)
• Transaction Constraint
Extended Mnemonics:
Programming Notes:
1. A byte index is always stored in the first operand, regardless of the element size. For example, if the element size is half word and the half word at element index 2 compares equal, a byte index of 4 is stored.
2. The third operand should not contain elements with a value of zero. If the third operand does contain a zero and it matches a zero element in the second operand before any other equal comparison, the condition code is set without regard to the setting of the zero compare bit.
VECTOR FIND ELEMENT NOT EQUAL (Vector Find Element Not Equal)

0168. Proceeding from left to right, the unsigned binary integer elements of the second operand are compared with the corresponding unsigned binary integer elements of the third operand. If two elements are not equal, the byte index of the leftmost non-equal element is placed in byte seven of the first operand and zeros are stored in all other bytes. If the Condition Code Set (CC) bit in the M5 field is set to one, the condition code is set to indicate which operand was greater. If all elements are equal, then a byte index equal to the vector size is placed in byte seven of the first operand and zeros are placed in all other byte locations. If the CC bit is one, condition code three is set.
0169. If the Zero Search (ZS) bit is set in the M5 field, each element in the second operand is also compared for equality with zero. If a zero element is found in the second operand before any other element of the second operand is found to be unequal, the byte index of the first byte of the element found to be zero is stored in byte seven of the first operand. Zeros are stored in all other bytes and condition code 0 is set.
0170. The M4 field specifies the element size (ES) control. The ES control specifies the size of the elements in the vector register operands. If a reserved value is specified, a specification exception is recognized.
0 - Byte
1 - Half Word
2 - Word
3-15 - Reserved
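As an illustration of paragraphs 0168 to 0170, the following Python sketch returns both the first operand and the condition code that would be set when the CC bit is one, using the code values given in paragraph 0174. The function name, the big-endian interpretation of elements, and the choice to test inequality before the zero search within a single element are assumptions, not statements from the text.

```python
VECTOR_BYTES = 16                    # assumed: a 128-bit vector register
ELEMENT_BYTES = {0: 1, 1: 2, 2: 4}   # ES code -> element size in bytes


def find_element_not_equal(vr2: bytes, vr3: bytes, es: int, zs: bool = False):
    """Hypothetical model: return (first_operand, condition_code)."""
    if es not in ELEMENT_BYTES:
        raise ValueError("specification exception: reserved ES value")
    size = ELEMENT_BYTES[es]
    index, cc = VECTOR_BYTES, 3      # all elements equal, no zero found
    for offset in range(0, VECTOR_BYTES, size):
        # Assumption: elements are big-endian unsigned integers.
        a = int.from_bytes(vr2[offset:offset + size], "big")
        b = int.from_bytes(vr3[offset:offset + size], "big")
        if a != b:
            index, cc = offset, (1 if a < b else 2)
            break
        if zs and a == 0:            # equal elements that are both zero
            index, cc = offset, 0
            break
    result = bytearray(VECTOR_BYTES)
    result[7] = index
    return bytes(result), cc
```

Returning the condition code alongside the result keeps the sketch free of global state; a caller modelling the CC bit as zero would simply ignore the second value.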
0171. Field M5 has the following format:

0172. The M5 field bits are defined as follows:
• Zero Search (ZS): If one, each element of the second operand is also compared to zero.
• Condition Code Set (CC): If zero, the condition code is not set and remains unchanged. If one, the condition code is set as specified in the next section.
Special Conditions
0173. A specification exception is recognized and no further action is taken if any of the following occurs:
1. The M4 field contains a value of 3-15.
2. Bits 0-1 of the M5 field are not zero.
Resulting Condition Code:
0174. If bit 3 of the M5 field is set to one, the code is set as follows:
0 If the zero compare bit is set, the comparison detected a zero element in both operands in an element with a lower index than any unequal comparison.
1 An element mismatch was detected and the element in VR2 is less than the element in VR3.
2 An element mismatch was detected and the element in VR2 is greater than the element in VR3.
3 All elements compared equal and, if the zero compare bit is set, no zero elements were found in the second operand.
0175. If bit 3 of the M5 field is zero, the code remains unchanged.
Program Exceptions:
• Data with DXC FE, Vector Register
• Operation if vector-extension feature is not installed
• Specification (Reserved ES value)
• Transaction Constraint
Extended Mnemonics:
VECTOR STRING RANGE COMPARE (Vector String Range Compare)

0176. Proceeding from left to right, the unsigned binary integer elements in the second operand are compared to ranges of values defined by even-odd pairs of elements in the third and fourth operands. These values, combined with the control values from the fourth operand, define the range of comparisons to be performed. If an element matches any of the ranges specified by the third and fourth operands, it is considered a match.
0177. If the Result Type (RT) flag in the M6 field is zero, the bit positions of the element in the first operand corresponding to the element being compared in the second operand are set to one if the element matches any of the ranges; otherwise, they are set to zero.
0178. If the Result Type (RT) flag in the M6 field is set to one, the byte index of the first element in the second operand that matches any of the ranges specified by the third and fourth operands, or that compares equal to zero if the ZS flag is set to one, is placed in byte seven of the first operand and zeros are stored in the remaining bytes. If no elements match, then an index equal to the number of bytes in a vector is placed in byte seven of the first operand and zeros are stored in the remaining bytes.
0179. The Zero Search (ZS) flag in the M6 field, if set to one, adds a comparison to zero of the second operand elements to the ranges provided by the third and fourth operands. If a zero comparison occurs in a lower indexed element than any other true comparison, then the condition code is set to zero.
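The two result formats selected by the RT flag in paragraphs 0177 and 0178 can be contrasted with a short Python sketch. The name `format_range_result` and the idea of passing in one precomputed boolean per element (with any zero-search hits already folded in by the caller) are illustrative assumptions.

```python
VECTOR_BYTES = 16                    # assumed: a 128-bit vector register
ELEMENT_BYTES = {0: 1, 1: 2, 2: 4}   # ES code -> element size in bytes


def format_range_result(matches, es: int, rt: bool) -> bytes:
    """Hypothetical model: build the first operand from per-element matches."""
    size = ELEMENT_BYTES[es]
    if len(matches) != VECTOR_BYTES // size:
        raise ValueError("one match flag is required per element")
    result = bytearray(VECTOR_BYTES)
    if not rt:
        # RT = 0: every bit of a matching element is set to one.
        for i, hit in enumerate(matches):
            if hit:
                result[i * size:(i + 1) * size] = b"\xff" * size
    else:
        # RT = 1: byte index of the first match, or 16 when none matched.
        index = VECTOR_BYTES
        for i, hit in enumerate(matches):
            if hit:
                index = i * size
                break
        result[7] = index
    return bytes(result)
```

For half word elements with matches at element indexes 1 and 4, the RT = 0 form sets bytes 2-3 and 8-9 to all ones, while the RT = 1 form stores 2 in byte seven.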
0180. The operands contain elements of the size specified by the Element Size control in the M5 field.
0181. The elements of the fourth operand have the following format: If ES equals 0:
If ES equals 1:
If ES equals 2:

0182. The bits in the elements of the fourth operand are defined as follows:
• Equal (EQ): When one, a comparison for equality is made.
• Greater Than (GT): When one, a greater-than comparison is made.
• Less Than (LT): When one, a less-than comparison is made.
• All other bits are reserved and should be zero to ensure future compatibility.
0183. The control bits may be used in any combination. If none of the bits are set, the comparison always produces a false result. If all of the bits are set, the comparison always produces a true result.
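How the EQ, GT and LT control bits combine with the even-odd pairs can be sketched in Python as follows. The bit masks chosen for `EQ`, `GT` and `LT` are hypothetical, since the element format figures are not reproduced here, and the rule that both halves of a pair must hold for the range to match is an assumption about how a pair defines a range.

```python
# Hypothetical bit masks; the actual bit positions are not given in the text.
EQ, GT, LT = 0b100, 0b010, 0b001


def compare_with_control(element: int, bound: int, control: int) -> bool:
    """True if any comparison enabled in `control` holds for element vs bound."""
    return bool(
        (control & EQ and element == bound)
        or (control & GT and element > bound)
        or (control & LT and element < bound)
    )


def matches_any_range(element: int, bounds, controls) -> bool:
    """bounds/controls: elements of the third and fourth operands, in order."""
    for i in range(0, len(bounds) - 1, 2):
        # Assumption: a range matches when both the even and the odd
        # comparison of the pair are true.
        if (compare_with_control(element, bounds[i], controls[i])
                and compare_with_control(element, bounds[i + 1], controls[i + 1])):
            return True
    return False
```

Under these assumptions, the pair ('a' with GT and EQ, 'z' with LT and EQ) describes the inclusive range of lowercase letters. A control value with no bits set never matches and one with all three bits set always matches, in line with paragraph 0183.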
0184. The M5 field specifies the element size control (ES). The ES control specifies the size of the elements in the vector register operands. If a reserved value is specified, a specification exception is recognized.
0 - Byte
1 - Half Word
2 - Word
3-15 - Reserved
0185. Field M6 has the following format:
• Invert Result (IN): If zero, the comparison proceeds with the pair of values in the control vector. If one, the result of the pairs of comparisons in the ranges is inverted.
• Result Type (RT): If zero, each resulting element is a mask of all range comparisons on that element. If one, an index is stored in byte seven of the first operand. Zeros are stored in the remaining bytes.
• Zero Search (ZS): If one, each element of the second operand is also compared to zero.
• Condition Code Set (CC): If zero, the condition code is not set and remains unchanged. If one, the condition code is set as specified in the next section.
Special Conditions
0186. A specification exception is recognized and no further action is taken if any of the following occurs:
1. The M4 field contains a value of 3-15.
Resultant Condition Code:
0 If ZS=1 and a zero is found in a lower indexed element than any comparison
1 Comparison found
2 --
3 No comparison found
Program Exceptions:
• Data with DXC FE, Vector Register
• Operation if vector-extension feature is not installed
• Specification (Reserved ES value)
• Transaction Constraint
Extended Mnemonics:
Figure 23-1: ES=1, ZS=0. VR1 (a): results with RT=0. VR1 (b): results with RT=1.
LOAD COUNT TO BLOCK BOUNDARY

0187. A 32-bit unsigned binary integer containing the number of bytes that can be loaded from the second operand location without crossing a specified block boundary, capped at sixteen, is placed in the first operand.
0188. The offset is treated as a 12-bit unsigned integer.
0189. The address of the second operand is not used to address data.
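The count computed by this instruction can be illustrated with a small Python sketch. The function name is hypothetical, and the boundary sizes are the M3 codes listed in paragraph 0190. The sketch assumes block boundaries fall at integral multiples of the block size.

```python
# M3 code -> block boundary size in bytes (codes 7-15 are reserved)
BOUNDARY_BYTES = {0: 64, 1: 128, 2: 256, 3: 512, 4: 1024, 5: 2048, 6: 4096}


def load_count_to_block_boundary(address: int, m3: int) -> int:
    """Hypothetical model: bytes loadable from `address` before the boundary."""
    if m3 not in BOUNDARY_BYTES:
        raise ValueError("specification exception: reserved boundary code")
    boundary = BOUNDARY_BYTES[m3]
    # Assumption: boundaries lie at multiples of the block size.
    to_boundary = boundary - (address % boundary)
    return min(16, to_boundary)      # the result is capped at sixteen
```

An address that sits 6 bytes short of a 64-byte boundary gives a count of 6, while any address at least 16 bytes away from the next boundary gives 16.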
0190. The M3 field specifies a code that signals to the CPU the block boundary size used to compute the number of bytes that can be loaded. If a reserved value is specified, a specification exception is recognized.
Code - Boundary
0 - 64 Bytes
1 - 128 Bytes
2 - 256 Bytes
3 - 512 Bytes
4 - 1 K-Byte
5 - 2 K-Bytes
6 - 4 K-Bytes
7-15 - Reserved
Resultant Condition Code:
0 Operand one is sixteen
1 --
2 --
3 Operand one is less than sixteen
Program Exceptions:
• Operation if vector-extension feature is not installed
• Specification
Programming Note: LOAD COUNT TO BLOCK BOUNDARY is expected to be used in conjunction with VECTOR LOAD TO BLOCK BOUNDARY to determine the number of bytes that were loaded.
VECTOR LOAD GR FROM VR ELEMENT

0191. The element of the third operand, of the size specified by the ES value in the M4 field and indexed by the second operand address, is placed in the first operand location. The third operand is a vector register. The first operand is a general register. If the index specified by the second operand address is greater than the highest numbered element of the specified element size in the third operand, the data in the first operand is unpredictable. If the vector register element is smaller than a double word, the element is right-aligned in the 64-bit general register and the remaining bits are filled with zeros.
0192. The address of the second operand is not used to address data; instead, the rightmost 12 bits of the address are used to specify the index of an element within the second operand.
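The element extraction in paragraphs 0191 and 0192 can be sketched in Python as below. The function name and the big-endian element layout are assumptions. Because the text leaves an out-of-range index unpredictable, the sketch raises an error in that case purely as a modelling choice.

```python
VECTOR_BYTES = 16                          # assumed: a 128-bit vector register
ELEMENT_BYTES = {0: 1, 1: 2, 2: 4, 3: 8}   # ES code -> element size in bytes


def load_gr_from_vr_element(vr3: bytes, address: int, es: int) -> int:
    """Hypothetical model: return the 64-bit general register value."""
    if es not in ELEMENT_BYTES:
        raise ValueError("specification exception: reserved ES value")
    size = ELEMENT_BYTES[es]
    index = address & 0xFFF                # rightmost 12 bits give the index
    if index >= VECTOR_BYTES // size:
        # The text calls the result unpredictable; the sketch rejects it.
        raise IndexError("element index beyond the vector register")
    start = index * size
    # A Python int models the right-aligned, zero-filled 64-bit register.
    return int.from_bytes(vr3[start:start + size], "big")
```

Only the low 12 bits of the address take part, so an address of 0x2003 selects element 3.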
0193. The M5 field specifies the element size (ES) control. The ES control specifies the size of the elements in the vector register operands. If a reserved value is specified, a specification exception is recognized.
0 - Byte
1 - Half Word
2 - Word
3 - Double Word
4-15 - Reserved
Resultant Condition Code: The code is unchanged.
Program Exceptions:
• Data with DXC FE, Vector Register
• Operation if vector-extension feature is not installed
• Specification (Reserved ES value)
• Transaction Constraint
Extended Mnemonics:
VECTOR LOAD TO BLOCK BOUNDARY (Vector Load to Block Boundary)

0194. The first operand is loaded, starting at the zero-indexed byte element, with bytes from the second operand. If a boundary condition is encountered, the remainder of the first operand is unpredictable. Access exceptions are not recognized on bytes that are not loaded. The offset for VLBB is treated as a 12-bit unsigned integer. The M3 field specifies a code that signals to the CPU the block boundary size to load to. If a reserved value is specified, a specification exception is recognized.
Code - Boundary
0 - 64 Bytes
1 - 128 Bytes
2 - 256 Bytes
3 - 512 Bytes
4 - 1 K-Byte
5 - 2 K-Bytes
6 - 4 K-Bytes
7-15 - Reserved
Resultant Condition Code: The code remains unchanged.
Program Exceptions:
• Access (fetch, operand 2)
• Data with DXC FE, Vector Register
• Operation if vector-extension feature is not installed
• Specification (Reserved Block Boundary Code)
• Transaction Constraint
Programming Notes:
1. In certain circumstances data may be loaded past the block boundary. However, this will only happen if there are no access exceptions on that data.
VECTOR STORAGE (Vector Store)

0195. The 128-bit value in the first operand is stored in the storage location specified by the second operand. The offset for VST is treated as a 12-bit unsigned integer.
Resultant Condition Code: The code remains unchanged.
Program Exceptions:
• Access (store, operand 2)
• Data with DXC FE, Vector Register
• Operation if vector-extension feature is not installed
• Specification (Reserved Block Boundary Code)
• Transaction Constraint
VECTOR STORAGE WITH EXTENSION (Vector Store with Length)

0196. Proceeding from left to right, bytes from the first operand are stored at the second operand location. The general register specified by the third operand contains a 32-bit unsigned integer representing the highest indexed byte to store. If the third operand contains a value greater than or equal to the highest byte index of the vector, all bytes of the first operand are stored.
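The length handling in paragraph 0196 can be illustrated with the following Python sketch, which models storage as a `bytearray`. The function name and the explicit bounds check are illustrative assumptions.

```python
VECTOR_BYTES = 16  # assumed: a 128-bit vector register


def vector_store_with_length(memory: bytearray, address: int,
                             vr1: bytes, gr3: int) -> int:
    """Hypothetical model: store bytes 0..highest of vr1; return bytes stored."""
    highest = gr3 & 0xFFFFFFFF                 # 32-bit unsigned integer
    highest = min(highest, VECTOR_BYTES - 1)   # clamp to the last byte index
    count = highest + 1                        # the value is an index, not a count
    if address < 0 or address + count > len(memory):
        raise IndexError("store falls outside the modelled storage")
    memory[address:address + count] = vr1[:count]
    return count
```

Because the third operand holds the highest byte index rather than a length, a value of 2 stores three bytes, and any value of 15 or more stores the whole vector.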
0197. Access exceptions are only recognized on bytes that are stored. The offset for VECTOR STORAGE WITH EXTENSION is treated as a 12-bit unsigned integer.
Resultant Condition Code: The condition code remains unchanged.
Program Exceptions:
• Access (store, operand 2)
• Data with DXC FE, Vector Register
• Operation if vector-extension feature is not installed
• Specification (Reserved Block Boundary Code)
• Transaction Constraint
RXB Description
0198. All vector instructions have a field in bits 36-40 of the instruction labeled RXB. This field contains the most significant bits for all of the vector register designated operands. Bits for register designations not specified by the instruction are reserved and should be set to zero; otherwise, the program may not operate compatibly in the future. The most significant bit is concatenated to the left of the four-bit register designation to create the five-bit vector register designation.
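The concatenation described above, which turns a non-contiguous specifier into a contiguous five-bit register number, can be sketched in Python using the bit assignments listed in paragraph 0199. The sketch assumes a 48-bit instruction held in an integer with bit 0 as the leftmost bit, and assumes the four defined RXB bits occupy instruction bits 36-39; the function names are hypothetical.

```python
INSTRUCTION_BITS = 48                    # assumed: six-byte vector instruction
REGISTER_FIELD_START = (8, 12, 16, 32)   # start bit of each 4-bit register field
RXB_START = 36                           # assumed position of the four RXB bits


def field(instruction: int, start: int, length: int) -> int:
    """Extract `length` bits beginning at bit `start` (bit 0 is leftmost)."""
    shift = INSTRUCTION_BITS - start - length
    return (instruction >> shift) & ((1 << length) - 1)


def vector_register(instruction: int, operand: int) -> int:
    """Return the 5-bit register number for register field 0..3."""
    low4 = field(instruction, REGISTER_FIELD_START[operand], 4)
    rxb = field(instruction, RXB_START, 4)
    high = (rxb >> (3 - operand)) & 1    # RXB bit 0 belongs to field 0
    return (high << 4) | low4            # extension bit concatenated on the left
```

An emulator could use the resulting value directly as a register number on the target architecture or map it through a table, which corresponds to the two alternatives recited in claims 7 and 8.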
0199. The bits are defined as follows:
0 - Most significant bit for the vector register designation in bits 8-11 of the instruction.
1 - Most significant bit for the vector register designation in bits 12-15 of the instruction.
2 - Most significant bit for the vector register designation in bits 16-19 of the instruction.
3 - Most significant bit for the vector register designation in bits 32-35 of the instruction.
Vector Enable Control
0200. The vector registers and instructions may only be used if both the vector enablement control (bit 46) and the AFP register control (bit 45) in control register zero are set to one. If the vector feature is installed and a vector instruction is executed without the enablement bits set, a data exception with DXC FE hex is recognized. If the vector feature is not installed, an operation exception is recognized.
Claims:
Claims (13)
[0001]
1. METHOD FOR TRANSFORMING INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT characterized by: obtaining, by a processor, from a first instruction defined for a first computer architecture, a non-contiguous specifier, the non-contiguous specifier having a first part and a second part, wherein the obtaining comprises obtaining the first part from a first field of the instruction and the second part from a second field of the instruction, the first field being separate from the second field; generating a contiguous specifier using the first part and the second part, the generating using one or more rules based on the opcode of the first instruction; and using the contiguous specifier to indicate a resource to be used in executing a second instruction, the second instruction being defined for a second computer architecture different from the first computer architecture and emulating a function of the first instruction.
[0002]
2. METHOD FOR TRANSFORMING INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 1, characterized in that: the processor includes an emulator, the first part includes a first one or more bits, the second part includes a second one or more bits, and the generating comprises concatenating the second one or more bits with the first one or more bits to form the contiguous specifier, wherein the second one or more bits are the most significant bits of the contiguous specifier.
[0003]
3. METHOD FOR TRANSFORMING INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 2, characterized in that: the first field has an operand position associated with it, the second one or more bits are a subset of a plurality of bits of the second field, and the obtaining comprises selecting the second one or more bits from the plurality of bits of the second field based on the operand position of the first field.
[0004]
4. METHOD TO TRANSFORM INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 3, characterized in that: the operand position of the first field is that of a first operand, and the second one or more bits are selected from a leftmost location of the second field.
[0005]
5. METHOD FOR TRANSFORMING INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 1, characterized in that: the first field consists of a register field, the second field consists of an extension field, the first part consists of a plurality of bits of the register field, the second part consists of a bit of the extension field at a location corresponding to the register field, and the generating consists of concatenating the bit of the extension field with the bits of the register field to provide the contiguous specifier.
[0006]
6. METHOD TO TRANSFORM INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 1, characterized by: the use of the contiguous specifier to indicate a resource includes the use of the contiguous specifier to map to a register to be used by the second instruction.
[0007]
7. METHOD TO TRANSFORM INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 6, characterized in that: the register mapped by the contiguous specifier has the same value as the contiguous specifier.
[0008]
8. METHOD TO TRANSFORM INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 6, characterized in that: the register mapped by the contiguous specifier has a different value from the contiguous specifier.
[0009]
9. METHOD FOR TRANSFORMING INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 1, characterized in that: the first computer architecture includes an instruction set composed of first instructions with register fields to access a subsection of a register space of the first computer architecture, and second instructions with non-contiguous register fields to access the subsection and remaining subsections of the register space, with the first instructions being prevented from accessing the remaining subsections.
[0010]
10. METHOD FOR TRANSFORMING INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 1, characterized in that: the first field consists of a register field, the second field consists of an extension field, the first part consists of a plurality of bits of the register field, the second part consists of a bit of the extension field at a location corresponding to the register field, and the generating comprises concatenating the bit of the extension field with the bits of the register field to provide the contiguous specifier, and further comprising the steps of: obtaining, by the processor, from the first instruction, another non-contiguous specifier, the other non-contiguous specifier having another first part and another second part, wherein the obtaining comprises obtaining the other first part from another first field of the instruction and the other second part from another bit of the extension field, the other first field being separate from the first field and the extension field; generating another contiguous specifier using the other first part and the other bit, the generating using one or more rules based on the opcode of the first instruction; and using the other contiguous specifier to indicate a resource to be used in executing the second instruction.
[0011]
11. METHOD FOR TRANSFORMING INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT, according to claim 1, characterized in that: the first computer architecture comprises a first instruction set architecture and the first instruction is defined for the first instruction set architecture, the second computer architecture comprises a second instruction set architecture and the second instruction is defined for the second instruction set architecture, and the second instruction set architecture is different from the first instruction set architecture.
[0012]
12. PHYSICAL SUPPORT TO TRANSFORM INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT characterized by: having recorded thereon the methods claimed in claims 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11 respectively.
[0013]
13. EQUIPMENT TO TRANSFORM INSTRUCTION SPECIFICATIONS OF A COMPUTATIONAL ENVIRONMENT characterized by comprising: a memory, and a processor in communication with the memory, wherein the equipment is configured to perform the methods claimed respectively in claims 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11.
Patent family:
Publication number | Publication date
US9454374B2|2016-09-27|
EP2769301A1|2014-08-27|
PL2769301T3|2020-06-01|
BR112014022638A2|2017-06-20|
MX340050B|2016-06-22|
IL232817D0|2014-07-31|
HUE048409T2|2020-07-28|
KR101643065B1|2016-07-26|
ES2779033T3|2020-08-13|
EP2769301A4|2014-11-19|
CN104169877A|2014-11-26|
SI2769301T1|2020-06-30|
TW201403468A|2014-01-16|
ZA201406612B|2016-05-25|
IL232817A|2017-07-31|
SG11201404825SA|2014-09-26|
US20130246768A1|2013-09-19|
RU2568241C2|2015-11-10|
DK2769301T3|2020-03-16|
JP6108362B2|2017-04-05|
WO2013136144A1|2013-09-19|
EP2769301B1|2020-02-19|
PT2769301T|2020-03-26|
MX2014010948A|2014-10-13|
LT2769301T|2020-05-25|
CA2867115C|2020-12-08|
HK1201354A1|2015-08-28|
US20130246766A1|2013-09-19|
US9280347B2|2016-03-08|
AU2012373735B2|2016-06-02|
CA2867115A1|2013-09-19|
HRP20200393T1|2020-06-12|
RU2012148583A|2014-05-20|
CN104169877B|2017-10-13|
KR20140104974A|2014-08-29|
JP2015514242A|2015-05-18|
TWI533207B|2016-05-11|
AU2012373735A1|2014-09-11|
引用文献:
公开号 | 申请日 | 公开日 | 申请人 | 专利标题

JPH0470662B2|1985-07-31|1992-11-11|Nippon Electric Co|
US5073864A|1987-02-10|1991-12-17|Davin Computer Corporation|Parallel string processor and method for a minicomputer|
US5222225A|1988-10-07|1993-06-22|International Business Machines Corporation|Apparatus for processing character string moves in a data processing system|
JPH0831032B2|1990-08-29|1996-03-27|三菱電機株式会社|Data processing device|
US5465374A|1993-01-12|1995-11-07|International Business Machines Corporation|Processor for processing data string by byte-by-byte|
AU6629894A|1993-05-07|1994-12-12|Apple Computer, Inc.|Method for decoding guest instructions for a host computer|
WO1994029790A1|1993-06-14|1994-12-22|Apple Computer, Inc.|Method and apparatus for finding a termination character within a variable length character string or a processor|
JPH0721034A|1993-06-28|1995-01-24|Fujitsu Ltd|Character string copying processing method|
US5509129A|1993-11-30|1996-04-16|Guttag; Karl M.|Long instruction word controlling plural independent processor operations|
US6185629B1|1994-03-08|2001-02-06|Texas Instruments Incorporated|Data transfer controller employing differing memory interface protocols dependent upon external input at predetermined time|
US5551013A|1994-06-03|1996-08-27|International Business Machines Corporation|Multiprocessor for hardware emulation|
WO1996010103A1|1994-09-27|1996-04-04|Nkk Corporation|Galvanized steel sheet and process for producing the same|
US5790825A|1995-11-08|1998-08-04|Apple Computer, Inc.|Method for emulating guest instructions on a host computer through dynamic recompilation of host instructions|
US5812147A|1996-09-20|1998-09-22|Silicon Graphics, Inc.|Instruction methods for performing data formatting while moving data between memory and a vector register file|
US5931940A|1997-01-23|1999-08-03|Unisys Corporation|Testing and string instructions for data stored on memory byte boundaries in a word oriented machine|
DE69804495T2|1997-11-24|2002-10-31|British Telecomm|INFORMATION MANAGEMENT AND RECOVERY OF KEY TERMS|
US6009261A|1997-12-16|1999-12-28|International Business Machines Corporation|Preprocessing of stored target routines for emulating incompatible instructions on a target processor|
US6041402A|1998-01-05|2000-03-21|Trw Inc.|Direct vectored legacy instruction set emulation|
US6094695A|1998-03-11|2000-07-25|Texas Instruments Incorporated|Storage buffer that dynamically adjusts boundary between two storage areas when one area is full and the other has an empty data register|
US6334176B1|1998-04-17|2001-12-25|Motorola, Inc.|Method and apparatus for generating an alignment control vector|
US6308255B1|1998-05-26|2001-10-23|Advanced Micro Devices, Inc.|Symmetrical multiprocessing bus and chipset used for coprocessor support allowing non-native code to run in a system|
US20020147969A1|1998-10-21|2002-10-10|Richard A. Lethin|Dynamic optimizing object code translator for architecture emulation and dynamic optimizing object code translation method|
JP3564395B2|1998-11-27|2004-09-08|松下電器産業株式会社|Address generation device and motion vector detection device|
US6192466B1|1999-01-21|2001-02-20|International Business Machines Corporation|Pipeline control for high-frequency pipelined designs|
US8127121B2|1999-01-28|2012-02-28|Ati Technologies Ulc|Apparatus for executing programs for a first computer architechture on a computer of a second architechture|
US6189088B1|1999-02-03|2001-02-13|International Business Machines Corporation|Forwarding stored dara fetched for out-of-order load/read operation to over-taken operation read-accessing same memory location|
US6499116B1|1999-03-31|2002-12-24|International Business Machines Corp.|Performance of data stream touch events|
US6802056B1|1999-06-30|2004-10-05|Microsoft Corporation|Translation and transformation of heterogeneous programs|
US6381691B1|1999-08-13|2002-04-30|International Business Machines Corporation|Method and apparatus for reordering memory operations along multiple execution paths in a processor|
US6513109B1|1999-08-31|2003-01-28|International Business Machines Corporation|Method and apparatus for implementing execution predicates in a computer processing system|
US6449706B1|1999-12-22|2002-09-10|Intel Corporation|Method and apparatus for accessing unaligned data|
JP2001236249A|2000-02-24|2001-08-31|Nec Corp|Device and method for managing memory|
US6625724B1|2000-03-28|2003-09-23|Intel Corporation|Method and apparatus to support an expanded register set|
US6349361B1|2000-03-31|2002-02-19|International Business Machines Corporation|Methods and apparatus for reordering and renaming memory references in a multiprocessor computer system|
US6701424B1|2000-04-07|2004-03-02|Nintendo Co., Ltd.|Method and apparatus for efficient loading and storing of vectors|
US6408383B1|2000-05-04|2002-06-18|Sun Microsystems, Inc.|Array access boundary check by executing BNDCHK instruction with comparison specifiers|
JP3801987B2|2000-10-18|2006-07-26|コーニンクレッカフィリップスエレクトロニクスエヌヴィ|Digital signal processor|
US7487330B2|2001-05-02|2009-02-03|International Business Machines Corporations|Method and apparatus for transferring control in a computer system with dynamic compilation capability|
US7100026B2|2001-05-30|2006-08-29|The Massachusetts Institute Of Technology|System and method for performing efficient conditional vector operations for data parallel architectures involving both input and conditional vector values|
JP3900863B2|2001-06-28|2007-04-04|シャープ株式会社|Data transfer control device, semiconductor memory device and information equipment|
US6839828B2|2001-08-14|2005-01-04|International Business Machines Corporation|SIMD datapath coupled to scalar/vector/address/conditional data register file with selective subpath scalar processing mode|
US6907443B2|2001-09-19|2005-06-14|Broadcom Corporation|Magnitude comparator|
US6570511B1|2001-10-15|2003-05-27|Unisys Corporation|Data compression method and apparatus implemented with limited length character tables and compact string code utilization|
US20100274988A1|2002-02-04|2010-10-28|Mimar Tibet|Flexible vector modes of operation for SIMD processor|
US7089371B2|2002-02-12|2006-08-08|Ip-First, Llc|Microprocessor apparatus and method for prefetch, allocation, and initialization of a block of cache lines from memory|
US7441104B2|2002-03-30|2008-10-21|Hewlett-Packard Development Company, L.P.|Parallel subword instructions with distributed results|
US7373483B2|2002-04-02|2008-05-13|Ip-First, Llc|Mechanism for extending the number of registers in a microprocessor|
US7376812B1|2002-05-13|2008-05-20|Tensilica, Inc.|Vector co-processor for configurable and extensible processor architecture|
US20040049657A1|2002-09-10|2004-03-11|Kling Ralph M.|Extended register space apparatus and methods for processors|
US6918010B1|2002-10-16|2005-07-12|Silicon Graphics, Inc.|Method and system for prefetching data|
US7103754B2|2003-03-28|2006-09-05|International Business Machines Corporation|Computer instructions for having extended signed displacement fields for finding instruction operands|
US20040215924A1|2003-04-28|2004-10-28|Collard Jean-Francois C.|Analyzing stored data|
US7035986B2|2003-05-12|2006-04-25|International Business Machines Corporation|System and method for simultaneous access of the same line in cache storage|
US20040250027A1|2003-06-04|2004-12-09|Heflinger Kenneth A.|Method and system for comparing multiple bytes of data to stored string segments|
US7610466B2|2003-09-05|2009-10-27|Freescale Semiconductor, Inc.|Data processing system using independent memory and register operand size specifiers and method thereof|
US7904905B2|2003-11-14|2011-03-08|Stmicroelectronics, Inc.|System and method for efficiently executing single program multiple data programs|
GB2411973B|2003-12-09|2006-09-27|Advanced Risc Mach Ltd|Constant generation in SMD processing|
US20060095713A1|2004-11-03|2006-05-04|Stexar Corporation|Clip-and-pack instruction for processor|
US7421566B2|2005-08-12|2008-09-02|International Business Machines Corporation|Implementing instruction set architectures with non-contiguous register file specifiers|
US9436468B2|2005-11-22|2016-09-06|Intel Corporation|Technique for setting a vector mask|
US8010953B2|2006-04-04|2011-08-30|International Business Machines Corporation|Method for compiling scalar code for a single instruction multiple data execution engine|
US7565514B2|2006-04-28|2009-07-21|Freescale Semiconductor, Inc.|Parallel condition code generation for SIMD operations|
CN101097488B|2006-06-30|2011-05-04|2012244安大略公司|Method for learning character fragments from received text and relevant hand-hold electronic equipments|
US9069547B2|2006-09-22|2015-06-30|Intel Corporation|Instruction and logic for processing text strings|
US7536532B2|2006-09-27|2009-05-19|International Business Machines Corporation|Merge operations of data arrays based on SIMD instructions|
US7991987B2|2007-05-10|2011-08-02|Intel Corporation|Comparing text strings|
CN101755265A|2007-05-21|2010-06-23|茵科瑞蒂梅尔有限公司|Interactive message editing system and method|
US20090063410A1|2007-08-29|2009-03-05|Nils Haustein|Method for Performing Parallel Data Indexing Within a Data Storage System|
US7895419B2|2008-01-11|2011-02-22|International Business Machines Corporation|Rotate then operate on selected bits facility and instructions therefore|
US7870339B2|2008-01-11|2011-01-11|International Business Machines Corporation|Extract cache attribute facility and instruction therefore|
US7739434B2|2008-01-11|2010-06-15|International Business Machines Corporation|Performing a configuration virtual topology change and instruction therefore|
US7877582B2|2008-01-31|2011-01-25|International Business Machines Corporation|Multi-addressable register file|
EP2245529A1|2008-02-18|2010-11-03|Sandbridge Technologies, Inc.|Method to accelerate null-terminated string operations|
DK176835B1|2008-03-07|2009-11-23|Jala Aps|Method of scanning, medium containing a program for carrying out the method and system for carrying out the method|
US8386547B2|2008-10-31|2013-02-26|Intel Corporation|Instruction and logic for performing range detection|
US20120023308A1|2009-02-02|2012-01-26|Renesas Electronics Corporation|Parallel comparison/selection operation apparatus, processor, and parallel comparison/selection operation method|
JP5471082B2|2009-06-30|2014-04-16|Fujitsu Limited|Arithmetic processing device and control method of arithmetic processing device|
US8595471B2|2010-01-22|2013-11-26|Via Technologies, Inc.|Executing repeat load string instruction with guaranteed prefetch microcode to prefetch into cache for loading up to the last value in architectural register|
JP2011212043A|2010-03-31|2011-10-27|Fujifilm Corp|Medical image playback device and method, as well as program|
US20110314263A1|2010-06-22|2011-12-22|International Business Machines Corporation|Instructions for performing an operation on two operands and subsequently storing an original value of operand|
US8972698B2|2010-12-22|2015-03-03|Intel Corporation|Vector conflict instructions|
US9009447B2|2011-07-18|2015-04-14|Oracle International Corporation|Acceleration of string comparisons using vector instructions|
US9280347B2|2012-03-15|2016-03-08|International Business Machines Corporation|Transforming non-contiguous instruction specifiers to contiguous instruction specifiers|
US9268566B2|2012-03-15|2016-02-23|International Business Machines Corporation|Character data match determination by loading registers at most up to memory block boundary and comparing|
US9454366B2|2012-03-15|2016-09-27|International Business Machines Corporation|Copying character data having a termination character from one memory location to another|
US9459864B2|2012-03-15|2016-10-04|International Business Machines Corporation|Vector string range compare|
US9459867B2|2012-03-15|2016-10-04|International Business Machines Corporation|Instruction to load data up to a specified memory boundary indicated by the instruction|
US9454367B2|2012-03-15|2016-09-27|International Business Machines Corporation|Finding the length of a set of character data having a termination character|
US9459868B2|2012-03-15|2016-10-04|International Business Machines Corporation|Instruction to load data up to a dynamically determined memory boundary|
US9715383B2|2012-03-15|2017-07-25|International Business Machines Corporation|Vector find element equal instruction|
US9710266B2|2012-03-15|2017-07-18|International Business Machines Corporation|Instruction to compute the distance to a specified memory boundary|
US9588762B2|2012-03-15|2017-03-07|International Business Machines Corporation|Vector find element not equal instruction|

WO2012103373A2|2011-01-27|2012-08-02|Soft Machines, Inc.|Variable caching structure for managing physical storage|
EP2668565B1|2011-01-27|2019-11-06|Intel Corporation|Guest instruction to native instruction range based mapping using a conversion look aside buffer of a processor|
WO2012103253A2|2011-01-27|2012-08-02|Soft Machines, Inc.|Multilevel conversion table cache for translating guest instructions to native instructions|
WO2012103359A2|2011-01-27|2012-08-02|Soft Machines, Inc.|Hardware acceleration components for translating guest instructions to native instructions|
WO2012103367A2|2011-01-27|2012-08-02|Soft Machines, Inc.|Guest to native block address mappings and management of native code storage|
WO2012103245A2|2011-01-27|2012-08-02|Soft Machines Inc.|Guest instruction block with near branching and far branching sequence construction to native instruction block|
US9459868B2|2012-03-15|2016-10-04|International Business Machines Corporation|Instruction to load data up to a dynamically determined memory boundary|
US9459864B2|2012-03-15|2016-10-04|International Business Machines Corporation|Vector string range compare|
US9459867B2|2012-03-15|2016-10-04|International Business Machines Corporation|Instruction to load data up to a specified memory boundary indicated by the instruction|
US9454367B2|2012-03-15|2016-09-27|International Business Machines Corporation|Finding the length of a set of character data having a termination character|
US9710266B2|2012-03-15|2017-07-18|International Business Machines Corporation|Instruction to compute the distance to a specified memory boundary|
US9454366B2|2012-03-15|2016-09-27|International Business Machines Corporation|Copying character data having a termination character from one memory location to another|
US9588762B2|2012-03-15|2017-03-07|International Business Machines Corporation|Vector find element not equal instruction|
US9268566B2|2012-03-15|2016-02-23|International Business Machines Corporation|Character data match determination by loading registers at most up to memory block boundary and comparing|
US9280347B2|2012-03-15|2016-03-08|International Business Machines Corporation|Transforming non-contiguous instruction specifiers to contiguous instruction specifiers|
US9715383B2|2012-03-15|2017-07-25|International Business Machines Corporation|Vector find element equal instruction|
US9923840B2|2012-08-20|2018-03-20|Donald Kevin Cameron|Improving performance and security of multi-processor systems by moving thread execution between processors based on data location|
US9513906B2|2013-01-23|2016-12-06|International Business Machines Corporation|Vector checksum instruction|
US9471308B2|2013-01-23|2016-10-18|International Business Machines Corporation|Vector floating point test data class immediate instruction|
US9715385B2|2013-01-23|2017-07-25|International Business Machines Corporation|Vector exception code|
US9804840B2|2013-01-23|2017-10-31|International Business Machines Corporation|Vector Galois Field Multiply Sum and Accumulate instruction|
US9823924B2|2013-01-23|2017-11-21|International Business Machines Corporation|Vector element rotate and insert under mask instruction|
WO2014151652A1|2013-03-15|2014-09-25|Soft Machines Inc|Method and apparatus to allow early dependency resolution and data forwarding in a microprocessor|
EP2972798B1|2013-03-15|2020-06-17|Intel Corporation|Method and apparatus for guest return address stack emulation supporting speculation|
US20140281398A1|2013-03-16|2014-09-18|William C. Rash|Instruction emulation processors, methods, and systems|
US9703562B2|2013-03-16|2017-07-11|Intel Corporation|Instruction emulation processors, methods, and systems|
US10120681B2|2014-03-14|2018-11-06|International Business Machines Corporation|Compare and delay instructions|
US9558032B2|2014-03-14|2017-01-31|International Business Machines Corporation|Conditional instruction end operation|
US9454370B2|2014-03-14|2016-09-27|International Business Machines Corporation|Conditional transaction end instruction|
WO2016014867A1|2014-07-25|2016-01-28|Soft Machines, Inc.|Using a conversion look aside buffer to implement an instruction set agnostic runtime architecture|
US10353680B2|2014-07-25|2019-07-16|Intel Corporation|System converter that implements a run ahead run time guest instruction conversion/decoding process and a prefetching process where guest code is pre-fetched from the target of guest branches in an instruction sequence|
US9792098B2|2015-03-25|2017-10-17|International Business Machines Corporation|Unaligned instruction relocation|
US10055208B2|2015-08-09|2018-08-21|Oracle International Corporation|Extending a virtual machine instruction set architecture|
US11263131B2|2020-04-08|2022-03-01|Alibaba Group Holding Limited|System and method for allocating memory space|
Legal status:
2018-12-04| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|
2019-12-10| B06U| Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]|
2021-07-13| B06A| Patent application procedure suspended [chapter 6.1 patent gazette]|
2021-11-30| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|
2022-01-04| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 15/11/2012, SUBJECT TO THE LEGAL CONDITIONS. |
Priority:
Application number | Filing date | Patent title
US13/421,657|2012-03-15|
US13/421,657|US9280347B2|2012-03-15|2012-03-15|Transforming non-contiguous instruction specifiers to contiguous instruction specifiers|
PCT/IB2012/056436|WO2013136144A1|2012-03-15|2012-11-15|Transforming non-contiguous instruction specifiers to contiguous instruction specifiers|