Method and System for Using Memory Free Hints Within a Computer System
Patent Abstract:
SYSTEM AND METHOD FOR INTELLIGENTLY FLUSHING DATA FROM A PROCESSOR INTO A MEMORY SUBSYSTEM. A system and method for intelligently flushing data from a processor cache are described. For example, a system in accordance with one embodiment of the invention comprises: a processor having a cache from which data associated with a specified system address range is flushed; and a PCM memory controller for managing access to data stored on a PCM memory device corresponding to the specified system address range; the processor determining whether memory free hints are enabled for the specified system address range, wherein, if memory free hints are enabled for that range, the processor sends a memory free hint to the PCM memory controller of the PCM memory device, and wherein the PCM memory controller uses the memory free hint to determine whether the flushed data needs to be saved to the PCM memory device.
Publication number: BR112014015051B1
Application number: R112014015051-6
Filing date: 2011-12-21
Publication date: 2021-05-25
Inventors: Murugasamy K. Nachimuthu; Mohan J. Kumar
Applicant: Intel Corporation
IPC main classification:
Patent Description:
BACKGROUND Field of the Invention [001] This invention relates generally to the field of computer systems. More particularly, the invention relates to an apparatus and method for implementing a multi-level memory hierarchy. Description of the Related Art Current Memory and Storage Configurations [002] One of the limiting factors for computer innovation today is memory and storage technology. In conventional computer systems, system memory (also known as main memory, primary memory, or executable memory) is usually implemented by dynamic random access memory (DRAM). DRAM-based memory consumes power even when no memory read or write occurs because it must constantly refresh its internal capacitors. DRAM-based memory is volatile, meaning that data stored in DRAM is lost once power is removed. Conventional computer systems also feature multiple levels of cache to improve performance. A cache is high-speed memory positioned between the processor and system memory to serve memory access requests faster than they could be served from system memory. Such caches are usually implemented with static random access memory (SRAM). Cache management protocols can be used to ensure that the most frequently accessed data and instructions are stored within one of the cache levels, thereby reducing the number of memory access operations and improving performance. [003] With respect to mass storage (also known as secondary storage or disk storage), conventional mass storage devices typically include magnetic media (e.g., hard disk drives), optical media (e.g., compact disc (CD) drives, digital versatile disc (DVD) drives, etc.), holographic media, and/or mass storage flash memory (e.g., solid state drives (SSDs), removable flash drives, etc.). In general, these storage devices are considered input/output (I/O) devices because they are accessed by the processor through multiple I/O adapters that implement multiple I/O protocols. These I/O adapters and I/O protocols consume a significant amount of power and can have a significant impact on the die area and form factor of the platform. Portable or mobile devices (e.g., laptops, netbooks, tablet computers, PDAs, portable media players, handheld gaming devices, digital cameras, cell phones, smartphones, feature phones, etc.) that have limited battery life when not connected to a permanent power supply may include removable mass storage devices (e.g., Embedded Multimedia Card (eMMC), Secure Digital (SD) card) that are typically coupled to the processor via low-power interconnects and I/O controllers in order to meet both active and idle power budgets. [004] With respect to firmware memory (such as boot memory, also known as BIOS flash), a conventional computer system generally uses flash memory devices to store persistent system information that is read often but rarely (if ever) written. For example, the initial instructions executed by a processor to initialize the main system components during the boot process (the Basic Input and Output System (BIOS) images) are normally stored in a flash memory device. Flash memory devices currently available on the market are generally limited in speed (e.g., 50 MHz). This speed is further reduced by the overhead of read protocols (e.g., to 2.5 MHz). In order to speed up BIOS execution, conventional processors typically cache a portion of the BIOS code during the Pre-Extensible Firmware Interface (PEI) phase of the boot process. 
Processor cache size places a restriction on the size of the BIOS code used in the PEI phase (also known as the "PEI BIOS code"). Phase Change Memory (PCM) and Related Technologies [005] Phase change memory (PCM), sometimes also called phase change random access memory (PRAM or PCRAM), PCME, Ovonic Unified Memory, or Chalcogenide RAM (CRAM), is a type of non-volatile computer memory that exploits the unique behavior of chalcogenide glass. As a result of the heat produced by the passage of an electric current, chalcogenide glass can be switched between two states: crystalline and amorphous. Recent versions of PCM can achieve two additional distinct states. [006] PCM provides higher performance than flash because the PCM memory element can be switched more quickly, writing (changing individual bits to either 1 or 0) can be done without the need to first erase an entire block of cells, and degradation from writes is slower (a PCM device may survive approximately 100 million write cycles; PCM degradation is caused by thermal expansion during programming, metal (and other material) migration, and other mechanisms). BRIEF DESCRIPTION OF THE DRAWINGS [007] The following description and accompanying drawings are used to illustrate embodiments of the invention. In the drawings: [008] FIG. 1 illustrates a system cache and memory arrangement in accordance with embodiments of the invention; [009] FIG. 2 illustrates a memory and storage hierarchy used in embodiments of the invention; [0010] FIG. 3 illustrates a computer system on which embodiments of the invention may be implemented; [0011] FIG. 4A illustrates a first system architecture that includes PCM in accordance with embodiments of the invention; [0012] FIG. 4B illustrates a second system architecture that includes PCM in accordance with embodiments of the invention; [0013] FIG. 4C illustrates a third system architecture that includes PCM in accordance with embodiments of the invention; [0014] FIG. 4D illustrates a fourth system architecture that includes PCM in accordance with embodiments of the invention; [0015] FIG. 4E illustrates a fifth system architecture that includes PCM in accordance with embodiments of the invention; [0016] FIG. 4F illustrates a sixth system architecture that includes PCM in accordance with embodiments of the invention; [0017] FIG. 4G illustrates a seventh system architecture that includes PCM in accordance with embodiments of the invention; [0018] FIG. 4H illustrates an eighth system architecture that includes PCM in accordance with embodiments of the invention; [0019] FIG. 4I illustrates a ninth system architecture that includes PCM in accordance with embodiments of the invention; [0020] FIG. 4J illustrates a tenth system architecture that includes PCM in accordance with embodiments of the invention; [0021] FIG. 4K illustrates an eleventh system architecture that includes PCM in accordance with embodiments of the invention; [0022] FIG. 4L illustrates a twelfth system architecture that includes PCM in accordance with embodiments of the invention; and [0023] FIG. 4M illustrates a thirteenth system architecture that includes PCM in accordance with embodiments of the invention. [0024] FIG. 5A illustrates an embodiment of a system architecture that includes a volatile "near" memory and a non-volatile "far" memory; [0025] FIG. 5B illustrates an embodiment of a memory-side cache (MSC); [0026] FIG. 
5C illustrates another embodiment of a memory-side cache (MSC) that includes an integrated tag cache and ECC generation/control logic; [0027] FIG. 5D illustrates an embodiment of an exemplary tag cache and exemplary ECC generation/control unit; [0028] FIG. 5E illustrates an embodiment of a PCM DIMM with a PCM controller; [0029] FIG. 6A illustrates MSC controllers and caches dedicated to specified system physical address (SPA) ranges in accordance with an embodiment of the invention; [0030] FIG. 6B illustrates an exemplary mapping between a system memory map, a "near" memory address map, and a PCM address map in accordance with an embodiment of the invention; [0031] FIG. 6C illustrates an exemplary mapping between a system physical address (SPA) and a PCM physical device address (PDA) or a "near" memory address (NMA) in accordance with an embodiment of the invention; [0032] FIG. 6D illustrates the interleaving of memory pages between a system physical address (SPA) space and a memory channel address (MCA) space in accordance with an embodiment of the invention; [0033] FIG. 7 illustrates an exemplary multiprocessor architecture on which embodiments of the invention may be implemented. [0034] FIG. 8 illustrates a system memory map in accordance with some embodiments of the invention. [0035] FIG. 9 illustrates an embodiment of a memory range register (MRR) containing flush hint data. [0036] FIG. 10 illustrates an embodiment of a PCMS memory controller. [0037] FIG. 11 illustrates an embodiment of a method for intelligently flushing data to a PCMS device. [0038] FIG. 12 illustrates a method in accordance with an embodiment of the invention. DETAILED DESCRIPTION [0039] In the following description, numerous specific details, such as logic implementations, opcodes (operation codes), means of specifying operands, resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth in order to provide a more complete understanding of the present invention. However, those skilled in the art will appreciate that the invention may be practiced without all of these specific details. In other instances, control structures, gate-level circuits, and full software instruction sequences have not been shown in detail so as not to obscure the invention. With the included descriptions, those of ordinary skill in the art will be able to implement appropriate functionality without undue experimentation. [0040] References in the specification to "one embodiment", "an example embodiment", etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment need not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases do not necessarily refer to the same embodiment. Furthermore, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments, whether or not they are explicitly described here. [0041] In the following description and in the claims, the terms "coupled" and "connected", along with their derivatives, may be used. It should be understood that these terms are not intended to be synonymous with each other. 
"Coupled" is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, cooperate or interact with each other. "Connected" is used to indicate the establishment of communication between two or more elements that are coupled with each other. [0042] Text between square brackets and blocks with dashed borders (e.g., large dashes, small dashes, dot and dash, dots) are sometimes used here to illustrate optional operations/components that add additional features to the embodiments of the invention. However, such notation should not be taken to mean that these are the only options or optional features/components, and/or that solid-edged blocks are not optional in certain embodiments of the invention. INTRODUCTION [0043] Memory capacity and performance requirements continue to increase with an increase in the number of processor cores and new usage models such as virtualization. Furthermore, memory power and costs have become a significant component of power and overall cost, respectively, of electronic systems. [0044] Some embodiments of the invention solve the above challenges by intelligently subdividing performance requirements and capacity requirements between memory technologies. The focus of this approach is to provide performance with a relatively small amount of relatively higher speed memory such as DRAM while implementing most of the system memory using significantly cheaper and denser nonvolatile random access memory (NVRAM). The embodiments of the invention described below define platform configurations that allow hierarchical memory subsystem organizations for the use of NVRAM. The use of NVRAM in the memory hierarchy also allows for new uses, such as the expanded boot space and mass storage implementations, as described in detail below. [0045] FIG. 1 illustrates a system cache and memory arrangement in accordance with embodiments of the invention. Specifically, Figure 1 shows a memory hierarchy, including a set of internal processor caches 120, "near memory" acting as a memory cache 121, which can include both the internal cache(s). s) 106 and external caches 107-109, and "far memory" 122. A particular type of memory that can be used for "far memory" in some embodiments of the invention is non-volatile random access memory ("NVRAM"). As such, an overview of NVRAM is provided below, followed by a general description of "far" memory and "near" memory. Nonvolatile Random Access Memory ("NVRAM") [0046] There are many possible technology options for NVRAM, including PCM, Phase Change Memory and Switch (PCMS) (the latter being a more specific application than the former), byte addressable persistent memory (BPRAM), universal memory, Ge2Sb2Te5, cell of programmable metallization (PMC), resistive memory (RRAM), RESET cell (amorphous), SET cell (crystalline), PCME, Ovshinsky memory, ferroelectric memory (also known as polymer and poly(N-vinylcarbazol) memory), memory ferromagnetic (also known as Spintronics, SPRAM (spin-transfer torque RAM), STRAM (spin tunneling RAM), magnetoresistive memory, magnetic memory, magnetic random access memory (MRAM) and Semiconductor-oxide-nitride-oxide-semiconductor (SONOS) , also known as dielectric memory). 
[0047] For use in the memory hierarchy described in this application, NVRAM has the following characteristics: [0048] it maintains its content even if power is removed, similar to the FLASH memory used in solid state disks (SSDs), and unlike SRAM and DRAM, which are volatile; [0049] lower power consumption when idle than volatile memories such as SRAM and DRAM; [0050] random access similar to SRAM and DRAM (also known as randomly addressable); [0051] rewritable and erasable at a lower level of granularity (e.g., byte level) than the FLASH found in SSDs (which can only be rewritten and erased one "block" at a time - a minimum of 64 Kbytes in size for NOR FLASH and 16 Kbytes for NAND FLASH); [0052] usable as system memory and allocated to all or part of the system memory address space; [0053] capable of being coupled to the processor over a bus using a transactional protocol (a protocol that supports transaction identifiers (IDs) to distinguish different operations, so that those operations can complete out of order) and allowing access at a level of granularity small enough to support the operation of NVRAM as system memory (e.g., a cache line size such as 64 or 128 bytes). For example, the bus may be a memory bus (e.g., a DDR bus such as DDR3, DDR4, etc.) over which a transactional protocol is run, as opposed to the non-transactional protocol that is normally used. As another example, the bus may be one over which a transactional protocol (a native transactional protocol) is normally run, such as a PCI Express (PCIe) bus, a Desktop Management Interface (DMI) bus, or any other type of bus using a transactional protocol and a sufficiently small transactional payload size (e.g., a cache line size such as 64 or 128 bytes); and [0054] one or more of the following: [0055] a faster write speed than non-volatile memory/storage technologies such as FLASH; [0056] a very high read speed (faster than FLASH and close or equivalent to DRAM read speeds); [0057] directly writable (rather than requiring erasing (overwriting with 1s) before writing data, as with the FLASH memory used in SSDs); and/or [0058] orders of magnitude (e.g., 2 or 3) higher write endurance before failure (more than the boot ROM and the FLASH used in SSDs). [0059] As mentioned above, in contrast to FLASH memory, which must be rewritten and erased a complete "block" at a time, the level of granularity at which NVRAM is accessed in any given implementation may depend on the particular memory controller and the particular memory bus, or other type of bus, to which the NVRAM is coupled. For example, in some implementations where NVRAM is used as system memory, it may be accessed at the granularity of a cache line (for example, a 64-byte or 128-byte cache line), despite an inherent ability to be accessed at byte granularity, because the cache line is the level at which the memory subsystem accesses memory. Thus, when NVRAM is deployed within a memory subsystem, it may be accessed at the same level of granularity as the DRAM (e.g., the "near" memory) used in the same memory subsystem. Even so, the level of granularity at which the NVRAM is accessed by the memory controller and memory bus or other type of bus is smaller than the block size used by FLASH and the access size of the I/O subsystem's controller and bus. [0060] NVRAM may also incorporate wear leveling algorithms to account for the fact that the storage cells at the far memory level begin to wear out after a number of write accesses, especially where a significant number of writes may occur, such as in a system memory implementation. 
Since high cycle count blocks are more likely to wear out in this way, usage leveling spreads and writes through the "far" memory cells, swapping the addresses of high cycle count blocks with low blocks cycle count. Note that most address switching is normally transparent to application programs because it is handled by hardware, lower-level software (for example, a low-level driver or operating system), or a combination of the two. Far memory [0061] Far 122 memory of some embodiments of the invention is implemented with NVRAM, but is not necessarily limited to any particular memory technology. "far" memory 122 is distinguishable from other memory/data and instruction storage technologies in terms of its characteristics and/or its application in the memory/storage hierarchy. For example, memory "far" 122 is very different from: [0062] static random access memory (SRAM), which can be used for level 0 and level 1 internal processor caches 101a-b, 102a-b, 103a-b and 104a-b dedicated to each of the processor cores 101-104, respectively, and low-level cache (LLC) 105 shared by processor cores; [0063] dynamic random access memory (DRAM), configured as an internal 106 cache for the processor 100 (e.g., in the same mold as the processor 100) and/or configured as one or more external 107-109 caches for the processor (for example, in the same package or in a different package of processor 100); and [0064] FLASH memory/magnetic disk/optical disk applied as mass storage (not shown); and [0065] memory such as FLASH memory or other read-only memory (ROM) applied as firmware memory (which may refer to boot ROM, BIOS Flash, and/or TPM Flash).(not shown). [0066] The "far" memory 122 can be used as a data and instruction store that is directly addressable by a processor 100 and capable of sufficiently keeping pace with processor 100, in contrast to FLASH/magnetic disk/optical disk applied as mass storage. In addition, as discussed above and described in detail below, "far" memory 122 can be placed on a memory bus and can communicate directly with a memory controller which, in turn, communicates directly with processor 100. [0067] Far 122 memory can be combined with other instruction and data storage technologies (eg DRAM) to form hybrid memories (also known as PCM and colocation DRAM; first level memory and second memory level; FLAM (FLASH and DRAM)). Note that at least some of the above technologies, including PCM/PCMS, can be used for mass storage instead of or in addition to system memory, and need not be randomly accessible, byte-addressable, or directly addressable by the processor when applied in this way. [0068] For convenience of explanation, most of the remainder of the application refers to "NVRAM" or, more specifically, "PCM," or "PCMS" as the technology selection for "far" memory 22. As such , the terms NVRAM, PCM, PCMS, and "far" memory may be used interchangeably in the following discussion. However, it should be noted, as discussed above, that different technologies can also be used for "far" memory. Also, NVRAM is not limited to use as "far" memory. "near" memory [0069] The "Memory Near" 121 is an intermediate level of memory configured in front of a far memory 122 that has lower read/write access latency compared to far memory and/or more symmetric read/write access latency (ie having read times that are roughly equivalent to write times). 
In some embodiments, "near" memory 121 has significantly lower write latency than "far"122 memory but similar read latency (eg, slightly less or equal); for example, "near" memory 121 may be volatile memory such as volatile random access memory (VRAM) and may include DRAM or other high-speed capacitor-based memory. Note, however, that the underlying principles of the invention are not limited to these types of specific memory. In addition, "near" memory 121 may have a relatively lower density and/or may be more expensive to manufacture than far memory 122. [0070] In one embodiment, the "near" memory 121 is configured between the far memory 122 and the internal processor caches 120. In some of the embodiments described below, the "near" memory 121 is configured as one or more of the processor caches. memory side (MSCs) 107-109 to mask performance and/or far memory usage limitations, including, for example, read/write latency limitations and memory degradation limitations. In these implementations, the combination of MSC 107-109 and "far" 122 memory operates at a performance level that approximates, is equivalent to, or better than, a system that uses only DRAM as system memory. As discussed in detail below, although shown as a "cache" in Figure 1, "near" memory 121 can include modes in which it performs other functions, either in addition to or in place of performing the cache function. [0071] The "near" memory 121 can be located on the processor die (as cache(s) 106) and/or located external to the processor die (as cache(s) 107-109) ( for example, on a separate chip located on top of the CPU pack, located outside the CPU pack with a high-bandwidth link to the CPU pack, for example, on a dual in-line memory module (DIMM), an elevation/mezzanine, or a computer motherboard). "Near" memory 121 may be coupled in communication with processor 100 using broadband single or multiple links, such as DDR or other high bandwidth transactional links (as described in detail below). AN EXAMPLE OF SYSTEM MEMORY ALLOCATION SCHEME [0072] Figure 1 illustrates how the different levels of caches 101109 are configured in relation to a system physical address space (SPA) 116-119 in the embodiments of the invention. As mentioned, this modality comprises a processor 100 with one or more 101-104 cores, with each core having its own dedicated top-level cache (L0) 101a-104a and mid-level cache (MLC) (L1) cache 101b- 104b. Processor 100 also includes a shared LLC 105. The operation of these different cache levels is well understood and will not be described in detail here. [0073] Caches 107-109 illustrated in Figure 1 can be dedicated to a particular system memory address range or to a set of non-contiguous address ranges. For example, cache 107 is dedicated to act as an MSC for system memory address range #1 116 and caches 108 and 109 are dedicated to act as MSCs for non-overlapping portions of system memory address ranges No. 2 117 and No. 3 118. This last implementation can be used for systems where the SPA space used by processor 100 is interleaved into an address space used by caches 107-109 (for example, when configured as MSCs ). In some embodiments, this last address space is considered a memory channel address space (MCA). In one embodiment, internal caches 101a-106 perform caching operations for the entire SPA space. 
[0074] System memory as used herein is memory that is visible and/or directly addressable by software running on processor 100; while 101a-109 cache memories can operate transparently to software in the sense that they do not form a directly addressable part of the system's address space, but the cores can also support the execution of instructions to allow the software to provide some control (configuration, policies, suggestions, etc.) for part or all of the caches. The subdivision of system memory into regions 116-119 can be performed manually, as part of a system configuration process (for example, by a system designer) and/or can be performed automatically by software. [0075] In one embodiment, system memory regions 116-119 are implemented using far memory (eg PCM) and, in some embodiments, "near" memory configured as system memory. System memory address range #4 represents an address range that is implemented using higher speed memory, such as DRAM, which can be near memory configured in a system memory mode (as opposed to a memory mode). caching). [0076] Figure 2 illustrates a memory/storage hierarchy 140 and different configurable modes of operation for "near" memory 144 and NVRAM according to embodiments of the invention. The memory/storage hierarchy 140 has several levels, including (1) a cache level 150, which can include 150A processor caches (for example, 101A-105 caches in Figure 1) and, optionally, "near" memory as a cache. for far 150B memory (in certain modes of operation, as described herein), (2) a system memory tier 151, which may include far 151B memory (e.g., NVRAM, such as PCM), when near memory is present (or just NVRAM as 174 system memory when near memory is not present) and optionally "near" memory operating as 151A system memory (in certain modes of operation as described here), (3) one level of storage mass 152 which may include flash/magnetic/mass storage 152B and/or mass storage of NVRAM 152A (e.g., a portion of NVRAM 142); and (4) a firmware memory level 153, which can include BIOS 170 flash and/or BIOS 172 NVRAM and, optionally, trusted platform module (TPM) NVRAM 173. [0077] As indicated, "near" memory 144 can be implemented to operate in several different modes, including: a first mode in which it operates as cache for far memory (near memory as cache for FM 150B); a second mode in which it operates as 151A system memory and occupies a portion of SPA space (sometimes referred to as "direct access" mode of near memory); and one or more additional modes of operation, such as scratch memory 192 or as temporary recording storage 193. In some embodiments of the invention, near memory is partitionable, where each partition can operate simultaneously in one of the different supported modes ; and different modalities can support configuration of partitions (eg sizes, modes) by hardware (eg fuses, pins), firmware, and/or software (eg via a set of programmable range registers within the controller MSC 124 within which, for example, different binary codes can be stored to identify each mode and partition). [0078] The system A 190 address space in Figure 2 is used to illustrate the operation when the memory is configured as an MSC for far 150B memory. In this configuration, system A address space 190 represents the entire system address space (and system B address space 191 does not exist). Alternatively, system address space B 191 is used to show an implementation when all or part of near memory is assigned to a part of the system address space. 
In this embodiment, system B address space 191 represents the range of system address space assigned to near memory 151A and system A address space 190 represents the range of system address space assigned to NVRAM 174. [0079] In addition, by acting as a cache for the 150B far memory, the 144 near memory can operate in various submodes under the control of the MSC 124 controller. In each of these modes, the near memory address space (NMA) is transparent to software in the sense that near memory does not form a directly addressable part of the system address space. These modes include, but are not limited to the following: Write-back caching mode: In this mode, all or parts of near memory acting as an FM 150B cache is used as a cache for far NVRAM (FM) 151B memory. While in write-back mode, each write operation is initially directed to near memory as cache for FM 150B (assuming the cache line to which the write is directed is present in the cache). The corresponding write operation is performed to update FM from NVRAM 151B only when the cache line in near memory as cache for FM 150B has to be replaced by another cache line (in contrast to the write-through mode described below, where each write operation is immediately propagated to the FM of NVRAM 151B). [0080] Near memory bypass mode: In this mode, all reads and writes bypass NM acting as FM 150B cache and go directly to FM NVRAM 151B. This mode can be used, for example, when an application is not cache-friendly or requires data to be compromised with persistence at the granularity of a cache row. In one embodiment, the storage performed by processor 150A and NM caches acting as an FM 150B cache operate independently of each other. Consequently, data can be cached in the NM acting as the FM 150B cache which is not cached in the 150A processor caches (and which, in some cases, may not be allowed to be cached in the 150A processor cache ) and vice versa. Thus, certain data that might be designated as "uncacheable" in the processor caches can be cached within the NM acting as an FM 150B cache. [0081] Near Memory Read-Cache Write Bypass Mode: This is a variation of the above mode, where reading storage of persistent data from FM NVRAM FM 151B is allowed (ie, persistent data is cached in near memory as far 150B memory cache for read-only operations). This is useful when most persistent data is "read-only" and the application is cache friendly. [0082] Near Memory Read-Cache Write-Through Mode: This is a variation of the "near memory read-cache write bypass" mode, where, in addition to reading the storage, the "write-hits" are also cached. Each recording to near memory as cache for FM 150B causes a recording for FM 151B FM. Thus, due to the write-through nature of the cache, cache line persistence is still guaranteed. [0083] When acting in direct near memory access mode, all or part of near memory such as 151A system memory is directly visible to the software and is part of the SPA space. This memory can be completely under the control of the software. Such a scheme can create a non-uniform memory address (NUMA) memory domain for software, where it gets greater performance from near 144 memory over NVRAM 174 system memory. By way of example, not limitation, such usage it can be used for certain high performance computing and graphics (HPC) applications that require very fast access to certain data structures. 
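The near-memory operating modes described above differ mainly in how a write is propagated to the NVRAM far memory (FM). The C sketch below summarizes that dispatch under assumed interfaces; the mode names follow the text, while the nm_*/fm_* helpers are hypothetical primitives that the patent does not define.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed platform primitives (not defined by the patent). */
void nm_cache_write(uint64_t spa, const void *line, bool dirty);
void nm_cache_invalidate(uint64_t spa);
void nm_direct_write(uint64_t spa, const void *line);
void fm_write(uint64_t spa, const void *line);   /* write to NVRAM far memory */

enum nm_mode {
    NM_WRITE_BACK,                 /* write-back caching mode                 */
    NM_BYPASS,                     /* near memory bypass mode                 */
    NM_READ_CACHE_WRITE_BYPASS,    /* read-cache, writes bypass near memory   */
    NM_READ_CACHE_WRITE_THROUGH,   /* read-cache, writes also go to FM        */
    NM_DIRECT_ACCESS               /* near memory used as system memory       */
};

void msc_handle_write(enum nm_mode mode, uint64_t spa, const void *line)
{
    switch (mode) {
    case NM_WRITE_BACK:
        /* Only the near-memory cache is updated; the dirty line reaches
         * far memory later, when it is evicted. */
        nm_cache_write(spa, line, /*dirty=*/true);
        break;
    case NM_BYPASS:
        /* Reads and writes bypass near memory and go straight to FM. */
        fm_write(spa, line);
        break;
    case NM_READ_CACHE_WRITE_BYPASS:
        /* Only reads are cached; a write goes to FM and any stale cached
         * copy is dropped. */
        nm_cache_invalidate(spa);
        fm_write(spa, line);
        break;
    case NM_READ_CACHE_WRITE_THROUGH:
        /* Write hits are cached, but every write is also propagated to
         * FM, so cache-line persistence is preserved. */
        nm_cache_write(spa, line, /*dirty=*/false);
        fm_write(spa, line);
        break;
    case NM_DIRECT_ACCESS:
        /* Near memory is part of the SPA space here; no caching logic. */
        nm_direct_write(spa, line);
        break;
    }
}
```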
[0084] In an alternative embodiment, the near memory direct access mode is implemented by "pinning" certain cache lines into near memory (ie, cache lines that contain data that is also simultaneously stored in NVRAM 142). Such pinning can be done effectively in larger, multi-way, and set-associative caches. [0085] Figure 2 also illustrates that a portion of NVRAM 142 can be used as firmware memory. For example, BIOS NVRAM part 172 can be used to store BIOS images (instead of or in addition to storing BIOS information in BIOS flash 170). The BIOS NVRAM part 172 may be a part of the SPA space and is directly addressable by software running on the processor cores 101-104, while the flash BIOS 170 is addressable through the I/O subsystem 115. As another example, a portion of Trusted Platform Module (TPM) 173 NVRAM can be used to protect sensitive system information (eg encryption keys). [0086] Thus, as indicated, NVRAM 142 can be implemented to operate in a variety of different modes, including as "far" 151B memory (for example, when "near" memory 144 is present/operational, is it acting as cache for FM via an MSC 124 control or not (accessed directly after the 101A - 105 caches and without an MSC 124 control)); NVRAM 174 system memory only (not as far memory as there is no near present/operational memory, and accessed without MSC 124 control); NVRAM 152A mass storage; BIOS NVRAM 172; and TPM NVRAM 173. Although different modalities may specify NVRAM modes in different ways, Figure 3 describes the use of a 333 decoding table. [0087] Figure 3 illustrates an exemplary computer system 300 in which embodiments of the invention can be implemented. Computer system 300 includes a processor 310 and memory/storage subsystem 380 with an NVRAM 142 used for both system memory, mass storage, and optional firmware memory. In one embodiment, NVRAM 142 comprises the system memory and storage hierarchy used by computer system 300 to store persistent and non-persistent data, instructions, states, and other information. As discussed earlier, NVRAM 142 can be configured to implement the functions of a typical hierarchy of memory and system storage, mass storage, and firmware memory, TPM memory, and the like. In the mode of Figure 3, NVRAM 142 is divided into FM 151B, NVRAM 152A mass storage, BIOS NVRAM 173, and NVRAM TMP 173. Storage hierarchies with different functions are also covered and the application of NVRAM 142 is not limited to the functions described above. [0088] By way of example, the operation while the near memory as FM 150B cache is in write-back caching is described. In one embodiment, while the near memory as cache for FM 150B is in the above mentioned write-back caching mode, a read operation will first arrive at the MSC 124 controller which will perform a look-up to determine if the requested data is present in memory. near acting as a cache for FM 150B (eg using a 342 tag cache). If present, it will return data to the CPU, core 101-104, or requesting I/O device via I/O subsystem 115. If the data is not present, the MSC 124 controller will send the request along with the system memory address for a 332 NVRAM controller. The 332 NVRAM controller will use the 333 decoding table to translate the system memory address to a physical device address (PDA) of NVRAM and direct the read operation for this far 151B memory region. 
In one embodiment, decode table 333 includes an address direction table (AIT) component, which the NVRAM controller 332 uses to translate between system memory addresses and NVRAM PDAs. In one embodiment, the AIT is updated as part of the usage leveling algorithm implemented to distribute memory access operations and thus reduce FM usage of NVRAM 151B. Alternatively, the AIT can be a separate table stored within the NVRAM 332 controller. [0089] Upon receiving the requested data from FM NVRAM 151B, the NVRAM 332 controller will return the requested data to the MSC 124 controller which will store the data in the MSC near memory acting as an FM 150B cache and will also send the data to the requesting processor core 101-104, or I/O device through I/O subsystem 115. Subsequent requests for this data can be serviced directly from near memory acting as an FM 150B cache until replaced by other data from the FM from NVRAM. [0090] As mentioned, in one embodiment, a memory write operation also goes first to the MSC controller 124, which writes to the MSC near memory acting as an FM 150B cache. In write-back caching mode, data cannot be sent directly to the FM of NVRAM 151B when a write operation is received. For example, data can be sent to FM from NVRAM 151B only when the MSC near memory acting as the FM 150B cache in which the data is stored must be reused for storing data to a different system memory address. When this happens, the MSC 124 controller notices that the data is not current in the NVRAM 151B FM and thus retrieves it from near memory acting as the FM 150B cache and sends it to the NVRAM 332 controller. The NVRAM controller 332 searches the PDA for the system memory address and then writes the data to the FM of NVRAM 51B. [0091] In Figure 3, NVRAM controller 332 is shown connected to FM 151B, NVRAM mass storage 152A, BIOS NVRAM 172 via three separate lines. This does not necessarily mean that there are three separate physical buses or communication channels that connect the NVRAM controller 332 to these parts of NVRAM 142. Instead, in some embodiments, a common memory bus or other type of bus (such as those described below referring to Figures 4A-M) is used to communicatively couple the NVRAM 332 controller to the FM 151B, NVRAM 152A mass storage, and BIOS 172 NVRAM. For example, in one modality, the three lines in Figure 3 represent a bus, such as a memory bus (eg a DDR3, DDR4 bus, etc.), over which the 332 NVRAM controller implements a transactional protocol to communicate with the 142 NVRAM. The 332 NVRAM controller can also communicate with NVRAM 142 over a bus that supports a native transactional protocol, such as a PCI Express bus, DMI (desktop management interface) bus, or any other ro bus type using a transactional protocol and a sufficiently small transactional payload size (eg cache line size such as 64 or 128 bytes). [0092] In one embodiment, the computer system 300 includes the integrated memory controller (IMC) 331, which performs access control to the central memory of the processor 310, which is coupled to: 1) a memory-side cache controller. memory (MSC) 124 to control access to near (NM) memory acting from far memory cache 150B; and 2) an NVRAM 332 controller to control access to NVRAM 142. Although illustrated as separate units in Figure 3, the MSC 124 controller and the NVRAM 332 controller can logically form part of the IMC 331. 
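The read-miss and write-back eviction flow of paragraphs [0088] to [0090] can be summarized with the following hedged C sketch. The AIT lookup, tag-cache probe, fill/evict helper, and the 64-byte line size are illustrative assumptions, not the patent's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

struct ait;                                  /* opaque address indirection table */

/* Assumed interfaces; the names are illustrative only. */
uint64_t ait_translate(const struct ait *t, uint64_t spa);       /* SPA -> PCM PDA  */
bool     nm_tag_lookup(uint64_t spa, void *line_out);            /* tag-cache probe */
void     nm_fill(uint64_t spa, const void *line, uint64_t *victim_spa,
                 void *victim_line, bool *victim_dirty);         /* fill + evict    */
void     pcm_read(uint64_t pda, void *line_out);
void     pcm_write(uint64_t pda, const void *line);

/* Read one 64-byte cache line at system physical address 'spa'. */
void msc_read_line(const struct ait *t, uint64_t spa, void *line_out)
{
    if (nm_tag_lookup(spa, line_out))
        return;                              /* hit in near memory (MSC) */

    /* Miss: the NVRAM controller resolves the physical device address
     * through its decode/AIT table and reads far memory. */
    pcm_read(ait_translate(t, spa), line_out);

    /* Fill near memory; if a dirty victim is evicted, only now is it
     * written back to far memory (write-back caching mode). */
    uint64_t victim_spa = 0;
    uint8_t  victim_line[64];
    bool     victim_dirty = false;

    nm_fill(spa, line_out, &victim_spa, victim_line, &victim_dirty);
    if (victim_dirty)
        pcm_write(ait_translate(t, victim_spa), victim_line);
}
```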
[0093] In the illustrated mode, the MSC controller 124 includes a set of range registers 336 that specify the operating mode in use for the NM acting as far 150B memory cache (e.g., write-back mode caching, mode near memory bypass, etc., described above). In the illustrated embodiment, DRAM 144 is used as the memory technology for the NM acting as far 150B memory cache. In response to a memory access request, the MSC controller 124 can determine (depending on the operating mode specified in the interval register 336) whether the request can be serviced from the NM acting as the FM 150B cache, or whether the The request must be sent to the NVRAM controller 332, which can then fulfill the request from the far (FM) 151B memory portion of NVRAM 142. [0094] In a modality where NVRAM 142 is implemented with PCMS, NVRAM 332 controller is a PCMS controller that performs access with protocols consistent with PCMS technology. As discussed earlier, PCMS memory is inherently capable of being accessed at one-byte granularity. However, the 332 NVRAM controller can access a PCMS 151B-based far memory at a lower level of granularity, such as a cache line (for example, a 64-bit or 128-bit cache line) or any other level of granularity consistent with the memory subsystem subsystem. The underlying principles of the invention are not limited to any specific level of granularity for accessing a PCMS 151B-based far memory. In general, however, when PCMS 151B-based far memory is used to form part of the system address space, the level of granularity will be higher than traditionally used for other non-volatile storage technologies such as FLASH , which can only perform rewrite and erase operations at the "block" level (minimum 64 Kbytes in size for NOR FLASH and 16 Kbytes for NAND FLASH). [0095] In the illustrated mode, NVRAM 332 controller can read configuration data to establish the modes, sizes, etc., previously described for NVRAM 142 from decoding table 333, or alternatively, can rely on the results decoding tables passed from IMC 331 and I/O subsystem 315. For example, at the time of manufacture or operation, computer system 300 can program decode table 333 to mark different regions of NVRAM 142 as system memory, storage mass exposed via SATA interfaces, mass storage via USB Bulk Only Transport (BOT) interfaces, encrypted storage that supports TPM storage, among others. The means by which access is directed to different partitions of the NVRAM device 142 is through decoding logic. For example, in one modality, the address range of each partition is defined in decoding table 333. In one modality, when the IMC 331 receives an access request, the destination address of the request is decoded to reveal whether the request is directed to memory, NVRAM mass storage, or I/O. If it is a memory request, the IMC 331 and/or the MSC controller 124 further determines from the destination address whether the request is directed to NM as cache for FM 150B or for FM 151B. To access FM 151B, the request is forwarded to NVRAM controller 332. The IMC 331 passes the request to I/O subsystem 115 if this request is directed to I/O (eg I/O devices storage and non-storage). I/O subsystem 115 further decodes the address to determine whether it points to NVRAM 152A mass storage, BIOS NVRAM 172, or other non-storage or storage I/O devices. If this address points to NVRAM 152A mass storage or BIOS172 NVRAM, I/O subsystem 115 forwards the request to NVRAM controller 332. 
If this address points to NVRAM 173 TMP, the I/O subsystem 115 passes the request to TPM 334 to perform secure access. [0096] In one embodiment, each request forwarded to the 332 NVRAM controller is accompanied by an attribute (also known as a "transaction type") to indicate the type of access. In one embodiment, the NVRAM controller 332 can emulate the access protocol for the type of access required so that the rest of the platform remains ignorant of the multiple roles played by NVRAM 142 in the memory and storage hierarchy. In alternative embodiments, NVRAM controller 332 can perform memory access to NVRAM 142, regardless of transaction type. It is understood that the decoding path may be different from what is described above. For example, IMC 331 can decode the destination address of an access request and determine if it is directed to NVRAM 142. If it is directed to NVRAM 142, IMC 331 generates an attribute according to the decoding table 333. Based on the attribute, the IMC 331 then forwards the request to the appropriate downstream logic (for example, NVRAM controller 332 and I/O subsystem 315) to perform access to the requested data. In another embodiment, the NVRAM controller 332 can decode the destination address if the corresponding attribute is not passed on from the upstream logic (eg, IMC 331 and I/O subsystem 315). Other decoding paths can be implemented. [0097] The presence of a new memory architecture, as described here, provides several new possibilities. Although discussed in more detail later, some of these possibilities are briefly highlighted below. [0098] According to a possible implementation, NVRAM 142 functions as a total replacement or supplement for the traditional DRAM technology in system memory. In one embodiment, NVRAM 142 represents the introduction of a second level system memory (for example, the system memory can be viewed as having a first level system memory comprising near memory as cache 150B (part of the device). DRAM 340) and a second level system memory comprising far (FM) 151B memory (part of NVRAM 142). [0099] Under some embodiments, NVRAM 142 functions as a total replacement or supplement for the 152B flash/magnetic/optical mass storage. As described above, in some embodiments, even though NVRAM 152A is capable of byte-level addressing, the NVRAM 332 controller can still access NVRAM 152A mass storage in multi-byte blocks, depending on the implementation (for example, 64 Kbytes, 128 Kbytes, etc.). The specific way in which data is accessed from the NVRAM 152A mass storage by the NVRAM 332 controller can be transparent to the software running by the 310 processor. For example, even though the NVRAM 152A mass storage can be accessed from Differently from 152A flash/optical/magnetic mass storage, the operating system can still view NVRAM 152A mass storage as a standard mass storage device (eg, a serial ATA hard drive or other standard type of device). mass storage device). [00100] In a modality where NVRAM 152A mass storage functions as a total replacement for 152B flash/magnetic/optical mass storage, it is not necessary to use storage drivers to access block addressable storage. . Removing storage driver overhead from storage access can increase access speed and save energy. 
In alternative embodiments, where it is desirable for NVRAM 152A mass storage to appear to the operating system and/or applications as block accessible and indistinguishable from 152B flash/magnetic/optical mass storage, emulated storage drivers can be used to expose block accessible interfaces (eg Universal Serial Bus (USB) Bulk-Only Transfer (BOT), 1.; Serial Advanced Technology Attachment (SATA), 3.0, and so on) for software to access NVRAM mass storage 152A. [00101] In one embodiment, NVRAM 142 functions as a total replacement or supplement for firmware memory such as BIOS flash 362 and TPM flash 372 (illustrated with dotted lines in Figure 3 to indicate they are optional). For example, NVRAM 142 may include an NVRAM portion of BIOS 172 to supplement or replace BIOS 362 flash, and it may include an NVRAM portion of TPM 173 to supplement or replace TPM 372 flash. Firmware memory may also store states system persistents used by a TPM 334 to protect sensitive system information (for example, encryption keys). In one embodiment, using NVRAM 142 for firmware memory eliminates the need for third-party flash parts to store code and data that are essential for system operations. [00102] Continuing then with the discussion of the system of Figure 3, in some embodiments, the computer system architecture 100 may include multiple processors, although, for simplicity, a single processor 310 is illustrated in Figure 3. The processor 310 can be any type of data processor, including a general-purpose or special-purpose central processing unit (CPU), an application-specific integrated circuit (ASIC), or a digital signal processor (DSP). For example, the 310 processor can be a general purpose processor such as a Core™ i3, i5, i7, 2 Duo and Quad, Xeon™ or Itanium™ processor, all of which are available from Intel Corporation of Santa Clara, California Alternatively, the 310 processor may be from another company, such as ARM Holdings, Ltd, of Sunnyvale, CA, MIPS Technologies of Sunnyvale, CA, etc. Processor 310 may be a special purpose processor, such as, for example, a network or communication processor, compression engine, graphics processor, coprocessor, integrated processor, or the like. Processor 310 can be implemented on one or more chips included within one or more packages. Processor 310 can be a part of and/or can be implemented on one or more substrates, using any number of process technologies, such as, for example, BiCMOS, CMOS or NMOS. In the mode shown in Figure 3, the 310 processor has a SOC (system-on-a-chip, or system-on-a-chip) configuration. [00103] In one embodiment, the processor 310 includes an integrated graphics unit 311 that includes logic to execute graphics commands, such as 3D or 2D graphics commands. While the embodiments of the invention are not limited to a particular integrated graphics unit 311, in one embodiment, the graphics unit 311 is capable of executing industry standard graphics commands such as those specified by the Open GL and/or Direct X APIs ( for example, OpenGL 4.1 and Direct X 11). [00104] Processor 310 may also include one or more cores 101-104, although a single core is illustrated in Figure 3, again, for clarity. In many embodiments, core(s) 101-104 include internal functional blocks, such as one or more execution units, reform units, a set of general and specific purpose registers, etc. If the core(s) are multi-threaded or hyperthreaded, then each hardware segment can be considered as a "logical" core as well. 
Kernels 101-104 can be homogeneous or heterogeneous in terms of architecture and/or instruction set. For example, some of the cores might be in order while others are out of order. As another example, two or more of the cores might be able to execute the same instruction set, while others might be able to execute only a subset of the instruction set or a different instruction set. [00105] Processor 310 may also include one or more caches, such as cache 313 which may be implemented as an SRAM and/or a DRAM. In many embodiments that are not shown, additional caches in addition to cache 313 are implemented so that there are multiple levels of cache between execution units in core(s) 101-104 and memory devices 150B and 151B. For example, the set of shared cache units might include a top-level cache such as a tier 1 (L1) cache, mid-level caches such as tier 2 (L2), tier 3 (L3), tier 4 ( L4), or other cache levels, an LLC and/or different combinations thereof. In different embodiments, the cache 313 can be divided into different shapes and can be of different sizes in different embodiments. For example, cache 313 might be an 8 megabyte (MB) cache, a 16 MB cache, and so on. Furthermore, in different modalities the cache can be a direct mapping cache, a fully associative cache, a pool associative multidirectional cache, or a cache of another mapping type. In other modalities that include multiple cores, cache 313 can include a large portion shared among all cores or can be split into several functional slices separately (eg, one slice for each core). Cache 313 can also include a common portion shared between all cores and various other portions that are functional slices separated by core. [00106] Processor 310 may also include a source agent 314, which includes such coordination components and operational cores 101-104. The originating agent unit 314 may include, for example, a power control unit (PCU) and a display unit. The PCU may include logic and components necessary for regulating the power state of cores 101-104 and the integrated graphics unit 311. The display unit is for controlling one or more externally connected monitors. [00107] As mentioned, in some embodiments, the 310 processor includes an integrated memory controller (IMC) 331, near memory cache controller (MSC), and NVRAM controller 332 all of which may be on the same chip as the processor. 310, or on a separate chip and/or package connected to the 310 processor. The DRAM device 144 can be the same chip or a different chip as the IMC 331 and the MSC 124 controller; thus, a single chip can have a 310 processor and 144 DRAM device; one chip may have the processor 310 and another DRAM device 144 and (these chips may be the same or different packages); one chip may have a 101-104 core and another the IMC 331, the MSC 124 controller and DRAM 144 (these chips may be the same or different packages); one chip can have cores 101-104, another the IMC 331 and the controller MSC 124, and another the DRAM 144 (these chips can be the same or different packages); etc. [00108] In some embodiments, the processor 310 includes an I/O subsystem 115 coupled to the CMI 331. 
The I/O subsystem 115 allows communication between the processor 310 and the following serial or parallel I/O devices: a or more 336 networks (such as a local area network, wide area network, or the Internet), storage I/O device (such as 152B flash/magnetic/optical mass storage, BIOS flash 362, TPM flash 372), and a or more non-storage 337 I/O devices (such as monitor, keyboard, speaker, and so on). I/O subsystem 115 may include a platform controller hub (PCH) (not shown) which further includes various I/O adapters 338 and other I/O circuits to provide access to I/O devices. storage and non-storages and networks. To achieve this goal, I/O subsystem 115 can have at least one built-in I/O adapter 338 for each I/O protocol used. I/O Subsystem 115 may be on the same chip as processor 310, or on a separate chip and/or package connected to processor 310. [00109] The 338 I/O adapters translate a host communication protocol used within the 310 processor to a protocol compatible with certain I/O devices. For 152B optical/magnetic/optical mass storage, some of the protocols that the 338 I/O adapter can translate include PCI-Expresss (PCI-E) 3.0; USB, 3.0; SATA, 3.0; Small Computer System Interface (SCSI), Ultra-640; IEEE 1394 "Firewire"; between others. For BIOS flash 362, some of the protocols that I/O adapters can translate include Serial Peripheral Interface (SPI), Microwire, and others. Additionally, there may be one or more wireless protocol I/O adapters. Examples of wireless protocols, among others, are used in personal area networks, such as IEEE 802.15 and Bluetooth, 4.0; wireless local area networks, such as wireless protocols based on IEEE 802.11; and cellular protocols. [00110] In some embodiments, I/O subsystem 115 is coupled to a TPM 334 control to control access to persistent system states, such as security data, encryption keys, platform configuration information, and so on. In one embodiment, these persistent system states are stored in a TMP 173 NVRAM and accessed via a 332 NVRAM controller. [00111] In one modality, the TPM 334 is a security microcontroller with cryptographic functionalities. The TPM 334 has a number of reliability-related capabilities; for example, a SEAL capability to ensure that data protected by a TPM is only available to the same TPM. The TPM 334 can protect data and keys (eg secrets) using its encryption capabilities. In one modality, the TPM 334 has a unique, secret RSA key, which allows it to authenticate hardware devices and platforms. For example, TPM 334 can verify that a system that requests access to data stored on computer system 300 is the expected system. The TPM 334 is also capable of providing information on the integrity of the platform (eg computer system 300). This allows an external resource (for example, a server on a network) to determine the reliability of the platform, but it does not prevent the user from accessing the platform. [00112] In some embodiments, the 315 I/O subsystem also includes a Management Engine (ME) 335, which is a microprocessor that allows a system administrator to monitor, maintain, upgrade, modernize, and repair a 300 computer system. In one embodiment, a system administrator can remotely configure computer system 300 by editing the contents of decoding table 333 through ME 335 via networks 336. [00113] For convenience of explanation, the remainder of the request sometimes refers to NVRAM 142 as a PCMS device. 
A PCMS device includes arrays of multilayer PCM cells (stacked vertically) that are non-volatile, have low power consumption, and are modifiable at the bit level. As such, the terms NVRAM device and PCMS device can be used interchangeably in the following discussion. However, it should be noted, as discussed above, that different technologies can also be used in addition to PCMS for NVRAM. [00114] It should be understood that a computer system may use NVRAM 142 for system memory, mass storage, firmware memory and/or other memory and storage purposes, even if the system processor uses -computer does not have all of the processor 310 components described above or has more components than processor 310. [00115] In the particular mode represented in Figure 3, the MSC controller 124 and NVRAM controller 332 are located in the same matrix or package (called CPU package), as the processor 310. In other embodiments, the MSC controller 124 and /or NVRAM 332 controller may be located outside the array or outside the CPU package, coupled to the 310 processor or CPU package through a bus, such as a memory bus (such as a DDR bus (for example, a DDR3, DDR4, etc.)), a PCI Express bus, a desktop management interface (DMI) bus, or any other type of bus. EXAMPLE OF PCM BUS AND PACKAGE SETTINGS [00116] Figures 4A-M illustrate a variety of different implementations, in which the processor, near memory and far memory are configured and packaged differently. In particular, the series of platform memory configurations in Figures 4A-M allow the use of a new non-volatile memory system such as PCM technologies or, more specifically, PCMS technologies. [00117] Although some of the same numerical designations are used in various numbers in Figures 4A-N, this does not necessarily mean that the structures identified by these numerical designations are always identical. For example, while the same numbers are used to identify an integrated memory controller (IMC) 331 CPU 401 in various figures, these components can be implemented differently in different figures. Some of these differences are not highlighted as they are not pertinent to understanding the underlying principles of the invention. [00118] Although several different system platform configuration approaches are described below, these approaches fall into two broad categories: split architecture and unified architecture. Briefly, in the split architecture scheme, a memory-side cache controller (MSC) (for example, located in the processor matrix or on a separate die in the CPU package) intercepts all system memory requests. There are two separate interfaces that "flow downstream" of that controller that come out of the CPU package to couple Near Memory and far memory. Each interface is tailored for the specific type of memory and each memory can be scaled independently in terms of performance and capacity. [00119] In the unified architecture scheme a single memory interface comes out of the processor die or CPU package and all memory requests are sent to this interface. The MSC controller along with the near and far memory subsystems are consolidated into this single interface. This memory interface must be tailored to meet the processor's memory performance requirements and must support at least one transactional protocol out of order, as PCMS devices may not process read requests in order. According to the general categories above, the following specific platform configurations can be employed. 
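Because PCMS devices may not service read requests in order, the transactional protocol referred to above tags each request with a transaction ID and matches responses to requests by that ID (this is elaborated below for Figure 4A). The following C sketch shows one possible host-side bookkeeping scheme; the queue depth and the send_to_pcm_controller() primitive are assumptions, not part of the patent.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

void send_to_pcm_controller(int txn_id, uint64_t spa);   /* assumed bus primitive */

#define MAX_OUTSTANDING 64                  /* assumed queue depth */

struct pending {
    bool     in_use;
    uint64_t spa;                           /* address the request was issued for */
    void    *dest;                          /* where the returned data should go  */
};

static struct pending pending_table[MAX_OUTSTANDING];

/* Issue a read and tag it with a free transaction ID.  Returns the ID,
 * or -1 when too many requests are already outstanding. */
static int issue_read(uint64_t spa, void *dest)
{
    for (int id = 0; id < MAX_OUTSTANDING; id++) {
        if (!pending_table[id].in_use) {
            pending_table[id] = (struct pending){ .in_use = true, .spa = spa, .dest = dest };
            send_to_pcm_controller(id, spa);
            return id;
        }
    }
    return -1;
}

/* Completion handler: responses may arrive in any order, so the data is
 * matched to its request purely by the returned transaction ID. */
static void on_response(int txn_id, const void *data, size_t len)
{
    if (txn_id < 0 || txn_id >= MAX_OUTSTANDING || !pending_table[txn_id].in_use)
        return;                             /* unknown or stale transaction ID */
    memcpy(pending_table[txn_id].dest, data, len);
    pending_table[txn_id].in_use = false;
}
```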
[00120] The embodiments described below include various types of buses/channels. The terms "bus" and "channel" are used synonymously here. The number of memory channels per DIMM socket will depend on the particular CPU package used in the computer system (with some CPU packages supporting, for example, three memory channels per socket). [00121] Furthermore, in the embodiments described below that use DRAM, virtually any type of DRAM memory channel can be used, including, by way of example and not limitation, DDR channels (for example, DDR3, DDR4, DDR5, etc.). Thus, while DDR is advantageous due to its wide acceptance in the industry and the resulting price points, etc., the underlying principles of the invention are not limited to any particular type of DRAM or volatile memory. [00122] Figure 4A illustrates an embodiment of a split architecture that includes one or more DRAM devices 403-406 operating as near memory acting as a cache for FM (i.e., as an MSC) in the CPU package 401 (either on the processor die or on a separate die) and one or more NVRAM devices, such as PCM memory, residing in DIMMs 450-451 acting as far memory. High-bandwidth links 407 in the CPU package 401 interconnect a single DRAM device or multiple DRAM devices 403-406 to the processor 310, which hosts the integrated memory controller (IMC) 331 and the MSC controller 124. Although illustrated as separate units in Figure 4A and the other figures described below, the MSC controller 124 may be integrated with the memory controller 331, in one embodiment. [00123] The DIMMs 450-451 use DDR slots and electrical connections defining a DDR channel 440 with DDR address, data and control lines and voltages (e.g., the DDR3 or DDR4 standard as defined by the Joint Electron Devices Engineering Council (JEDEC)). The PCM devices in the DIMMs 450-451 provide the far memory capacity of this split architecture, with the DDR channels 440 to the CPU package 401 capable of carrying both DDR and transactional protocols. In contrast to DDR protocols, in which the processor 310 or other logic within the CPU package (for example, the IMC 331 or the MSC controller 124) transmits a command and receives an immediate response, the transactional protocol used to communicate with the PCM devices allows the CPU 401 to issue a series of transactions, each identified by a unique transaction ID. The commands are serviced by a PCM controller on the receiving PCM DIMMs, which sends responses back to the CPU package 401, potentially out of order. The processor 310 or other logic within the CPU package 401 identifies each transaction response by its transaction ID, which is sent with the response. The above configuration allows the system to support both DDR DRAM-based DIMMs (using DDR protocols over DDR electrical connections) and PCM-based DIMM configurations (using transactional protocols over the same DDR electrical connections). [00124] Figure 4B illustrates a split architecture that uses DDR DRAM-based DIMMs 452 coupled through DDR channels 440 to form near memory, which acts as an MSC. The processor 310 hosts the memory controller 331 and the MSC controller 124. NVRAM devices such as PCM memory devices reside in PCM-based DIMMs 453 that use DDR slots and electrical connections on additional DDR channels 442 outside of the CPU package 401. The PCM-based DIMMs 453 provide the far memory capacity of this split architecture, with the DDR channels 440 to the CPU package 401 being based on DDR electrical connections and capable of carrying both DDR and transactional protocols. 
This allows the system to be configured with numerous DDR DRAM DIMMs 452 (e.g., DDR4 DIMMs) and PCM DIMMs 453 to achieve the desired capacity and/or performance points. [00125] Figure 4C illustrates a split architecture that hosts the near memory 403-406 acting as a memory-side cache (MSC) on the CPU package 401 (either on the processor die or on a separate die). High-bandwidth links 407 in the CPU package are used to interconnect a single DRAM device or multiple DRAM devices 403-406 to the processor 310, which hosts the memory controller 331 and the MSC controller 124, as defined by the split architecture. NVRAM, such as PCM memory devices, resides on PCI Express cards or risers 455 that use PCI Express electrical connections and the PCI Express protocol or a different transactional protocol over the PCI Express bus 454. The PCM devices on the PCI Express cards or risers 455 provide the far memory capacity of this split architecture. [00126] Figure 4D illustrates a split architecture that uses DDR DRAM-based DIMMs 452 and DDR channels 440 to form near memory, which acts as an MSC. The processor 310 hosts the memory controller 331 and the MSC controller 124. NVRAM, such as PCM memory devices, resides on PCI Express cards or risers that use PCI Express electrical connections and the PCI Express protocol or a different transactional protocol over the PCI Express bus 454. The PCM devices on the PCI Express cards or risers 455 provide the far memory capacity of this split architecture, with the memory channel interfaces outside of the CPU package 401 providing multiple DDR channels 440 to the DDR DRAM DIMMs 452. [00127] Figure 4E illustrates a unified architecture that hosts both the near memory acting as an MSC and the far memory NVRAM, such as PCM, on PCI Express cards or risers 456 that use PCI Express electrical connections and the PCI Express protocol or a different transactional protocol over the PCI Express bus 454. The processor 310 hosts the integrated memory controller 331 but, in this unified architecture case, the MSC controller 124 resides on the card or riser 456, along with the DRAM near memory and the NVRAM far memory. [00128] Figure 4F illustrates a unified architecture that hosts both the near memory acting as an MSC and the far memory NVRAM, such as PCM, in DIMMs 458 using DDR channels 457. The near memory in this unified architecture comprises the DRAM on each DIMM 458, acting as the memory-side cache for the PCM devices on that same DIMM 458, which make up the far memory of that particular DIMM. The MSC controller 124 resides on each DIMM 458, along with the near and far memory. In this embodiment, multiple memory channels of a DDR bus 457 are provided outside the CPU package. The DDR bus 457 of this variant implements a transactional protocol over DDR electrical connections. [00129] Figure 4G illustrates a hybrid split architecture, in which the MSC controller 124 resides on the processor 310 and both the near and far memory interfaces share the same DDR bus 410. This configuration uses DDR DRAM-based DIMMs 411a as near memory acting as an MSC, with PCM-based DIMMs 411b (i.e., far memory) residing on the same memory channel of the DDR bus 410, using DDR slots and populated with NVRAM (such as PCM memory devices). The memory channels of this embodiment carry both DDR and transactional protocols simultaneously to handle the near memory and far memory DIMMs 411a and 411b, respectively. 
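By way of illustration only, the configuration just described, in which DDR and transactional protocols share the same physical memory channel, can be sketched as a per-request protocol selection based on the address range of the target DIMM. The decode table contents and names below are assumptions for the sketch, not part of the described embodiments.

```c
/*
 * Illustrative sketch only: choosing, per request, whether to drive the
 * shared DDR channel with ordinary synchronous DDR commands (DRAM DIMM)
 * or with the ID-tagged transactional protocol (PCM DIMM), based on
 * which address range the request falls in.
 */
#include <stdint.h>

enum dimm_protocol { PROTO_DDR, PROTO_TRANSACTIONAL };

struct channel_range {
    uint64_t base;               /* first system address of the range */
    uint64_t limit;              /* one past the last address         */
    enum dimm_protocol proto;    /* protocol used for this range      */
};

/* Example decode table: low range backed by a DRAM DIMM, high range by PCM. */
static const struct channel_range decode[] = {
    { 0x0000000000ULL, 0x0400000000ULL, PROTO_DDR           },
    { 0x0400000000ULL, 0x2400000000ULL, PROTO_TRANSACTIONAL },
};

enum dimm_protocol select_protocol(uint64_t addr)
{
    for (unsigned i = 0; i < sizeof(decode) / sizeof(decode[0]); i++)
        if (addr >= decode[i].base && addr < decode[i].limit)
            return decode[i].proto;
    return PROTO_DDR;            /* default for unmapped addresses */
}
```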
[00130] Figure 4H illustrates a unified architecture in which the near memory 461a acting as a memory-side cache resides on a mezzanine or riser 461, in the form of DDR DRAM-based DIMMs. The memory-side cache (MSC) controller 124 is located in the riser's DDR and PCM controller 460, which can have two or more memory channels connecting to the DDR DIMM channels 470 on the mezzanine/riser 461 and interconnecting to the CPU over high-performance interconnects 462, such as a differential memory link. The associated far memory 461b resides on the same mezzanine/riser 461 and is made up of DIMMs that use the DDR channels 470 and are populated with NVRAM (such as PCM devices). [00131] Figure 4I illustrates a unified architecture that can be used as a memory capacity expansion to a DDR memory subsystem, with DIMMs 464 connected to the CPU package 401 on its DDR memory subsystem through a DDR bus 471. For the additional NVM-based capacity in this configuration, the near memory acting as an MSC resides on a mezzanine or riser 463, in the form of DRAM-based DDR DIMMs 463a. The MSC controller 124 is located in the riser's DDR and PCM controller 460, which can have two or more memory channels connecting to the DDR DIMM channels 470 on the mezzanine/riser and interconnecting to the CPU over high-performance interconnects 462, such as a differential memory link. The associated far memory 463b resides on the same mezzanine/riser 463 and is made up of DIMMs 463b that use the DDR channels 470 and are populated with NVRAM (such as PCM devices). [00132] Figure 4J illustrates a unified architecture in which the near memory acting as a memory-side cache (MSC) resides on each DIMM 465, in the form of DRAM. The DIMMs 465 are on a high-performance interconnect/channel 462, such as a differential memory link, coupling the CPU package 401 with the MSC controller 124 located on the DIMMs. The associated far memory resides on the same DIMMs 465 and is made up of NVRAM (such as PCM devices). [00133] Figure 4K illustrates a unified architecture in which the near memory acting as a memory-side cache (MSC) resides on each DIMM 466, in the form of DRAM. The DIMMs are on high-performance interconnects 470 connecting the CPU package 401 with the MSC controller 124 located on the DIMMs. The associated far memory resides on the same DIMMs 466 and is made up of NVRAM (such as PCM devices). [00134] Figure 4L illustrates a split architecture that uses DDR DRAM-based DIMMs 464 on a DDR bus 471 to form the necessary near memory, which acts as an MSC. The processor 310 hosts the integrated memory controller 331 and the memory-side cache controller 124. NVRAM such as PCM memory forms the far memory, which resides on cards or risers 467 that use high-performance interconnects 468 communicating with the CPU package 401 using a transactional protocol. The cards or risers 467 hosting the far memory host a single buffer/controller, which can control multiple PCM-based memory devices or multiple PCM-based DIMMs connected to that riser. [00135] Figure 4M illustrates a split architecture that uses DRAM on a card or riser 469 to form the required near memory, which acts as an MSC. NVRAM, such as PCM memory devices, forms the far memory, which also resides on cards or risers 467 that use high-performance interconnects 468 to the CPU package 401. The cards or risers 469 hosting the far memory host a single buffer/controller, which can control multiple PCM-based devices or multiple PCM-based DIMMs on that riser 469 and also integrates the memory-side cache controller 124. 
[00136] In some of the embodiments described above, such as the one illustrated in Figure 4G, the DDR DIMMs 411a and the PCM-based DIMMs 411b reside on the same memory channel. Consequently, the same set of address/control and data lines is used to connect the CPU to both the PCM and DRAM memories. To reduce the amount of data traffic through the CPU fabric interconnect, in one embodiment, a DDR DIMM on a common memory channel with a PCM-based DIMM is configured to act as the single MSC for data stored in the PCM-based DIMM. In such a configuration, far memory data stored in the PCM-based DIMM is only stored in the DDR DIMM near memory within the same memory channel, thereby localizing memory transactions to that particular memory channel. [00137] Furthermore, to implement the above embodiment, the system address space can be logically subdivided between the different memory channels. For example, if there are four memory channels, then 1/4 of the system address space can be assigned to each memory channel. If each memory channel is equipped with a PCMS-based DIMM and a DDR DIMM, the DDR DIMM can be configured to act as the MSC for that 1/4 of the system address space. [00138] The choice of system memory and mass-storage devices may depend on the type of electronic platform on which embodiments of the invention are used. For example, on a computer, tablet, notebook, smartphone, mobile phone, feature phone, PDA, portable media player, portable gaming device, game console, digital camera, switch, hub, router, set-top box, digital video recorder or other device that has relatively small mass storage requirements, mass storage can be implemented using NVRAM mass storage 152A, or using NVRAM mass storage 152A in combination with flash/magnetic/optical mass storage. On other electronic platforms that have relatively large mass storage requirements (e.g., large servers), mass storage can be implemented using magnetic storage (e.g., hard drives) or any combination of magnetic storage, optical storage, holographic storage, mass storage flash memory, and NVRAM mass storage 152A. In this case, the storage system hardware and/or software can allocate persistent program code and data blocks between the FM 151B/NVRAM storage 152A and a flash/magnetic/optical mass storage 152B in an efficient or otherwise useful manner. [00139] For example, in one embodiment, a high-powered server is configured with near memory (e.g., DRAM), a PCMS device, and a magnetic mass storage device for large amounts of persistent storage. In one embodiment, a portable computer is configured with near memory and a PCMS device that performs the function of both the far memory and the mass storage device (that is, it is logically partitioned to perform these functions, as shown in Figure 3). An embodiment of a home or office computer is configured similarly to a portable computer, but may also include one or more magnetic storage devices that provide large amounts of persistent storage capability. [00140] An embodiment of a tablet computer or cell phone device is configured with PCMS memory but potentially no near memory and no additional mass storage (for cost/energy savings). However, the tablet/phone can be configured with a removable mass storage device such as a PCMS flash or pen drive. [00141] Various other types of devices can be configured as described above. For example, portable media players and/or PDAs can be configured similarly to the tablets/phones described above, and game consoles can be configured similarly to workstations or laptops. 
Other devices that can be similarly configured include digital cameras, routers, set-top boxes, digital video recorders, televisions and automobiles. EMBODIMENTS OF THE MSC ARCHITECTURE [00142] In an embodiment of the invention, the bulk of the DRAM in system memory is replaced with PCM. As discussed earlier, PCM provides significant improvements in memory capacity at a significantly lower cost compared to DRAM and is non-volatile. However, certain characteristics of PCM, such as asymmetric read vs. write performance, write cycle endurance limits, as well as its non-volatile nature, make it difficult to directly replace DRAM without requiring major software changes. The embodiments of the invention described below provide a software-transparent way to integrate PCM, while also allowing for new uses through software enhancements. These embodiments promote a successful transition in memory subsystem architecture and provide a way to consolidate memory and storage using a single PCM pool, thus alleviating the need for a separate non-volatile storage tier on the platform. [00143] The particular embodiment illustrated in Figure 5A includes one or more processor cores 501, each with an internal memory management unit (MMU) 502 for generating memory requests and one or more internal CPU caches 503 to store lines of program code and data according to a specified cache management policy. As mentioned earlier, the cache management policy can include an exclusive cache management policy (where any line present at a particular cache level in the hierarchy is not present at any other cache level) or an inclusive cache management policy (where duplicate cache lines are stored at different levels of the cache hierarchy). The specific cache management policies that can be used for managing the internal caches 503 are well understood by those skilled in the art and as such will not be described here in detail. The underlying principles of the invention are not limited to any specific cache management policy. [00144] Also illustrated in Figure 5A is a "home agent" 505, which provides access to the MSC 510 and generates memory channel addresses (MCAs) for memory requests. The home agent 505 is responsible for managing a specified memory address space and resolves memory access conflicts directed at that memory space. Thus, if any core needs to access a given address space, it will send requests to that home agent 505, which will then send the request to that particular MMU 502. In one embodiment, one home agent 505 is assigned per MMU 502; however, in some embodiments, a single home agent 505 may serve more than one memory management unit 502. [00145] As illustrated in Figure 5A, an MSC 510 is configured in front of the PCM-based far memory 519. The MSC 510 manages access to the near memory 518 and forwards memory access requests (e.g., reads and writes) to the far memory controller 521 when appropriate (e.g., when requests cannot be serviced from the near memory 518). The MSC 510 includes a cache control unit 512, which operates in response to a tag cache 511 that stores the tags identifying the cache lines contained in the near memory 518. In operation, when the cache control unit 512 determines that a memory access request can be serviced from the near memory 518 (for example, in response to a cache hit), it generates a near memory address (NMA) to identify the data stored in the near memory 518. A near memory control unit 515 interprets the NMA and responsively generates the electrical signals to access the near memory 518. 
As mentioned earlier, in one embodiment, near memory is dynamic random access memory (DRAM). In such a case, the electrical signals may include row address strobe (RAS) signals and column address strobe (CAS) signals. It should be noted, however, that the underlying principles of the invention are not limited to the use of DRAM as near memory. [00146] Another component that ensures a software-transparent application of the memory is an optimized PCM far memory controller 521, which manages the characteristics of the PCM far memory 530 while continuing to provide the required performance. In one embodiment, the PCM controller 521 includes an address indirection table 520 that converts the MCA generated by the cache control unit 515 into a PDA (PCM device address), which is used to directly address the PCM far memory 530. The conversion occurs at the granularity of a "block", which is normally 5 KB. The translation is necessary because, in one embodiment, the far memory controller 521 continuously moves PCM blocks throughout the PCM device address space to ensure that hot spots are not worn out due to a high frequency of writes to any specific block. As described above, such a technique is sometimes referred to herein as "usage leveling". [00147] Thus, the MSC 510 is managed by the cache control unit 512, which allows the MSC 510 to absorb, coalesce and filter transactions (e.g., reads and writes) to the PCM far memory 530. The cache control unit 512 manages all data movement and consistency requirements between the near memory 518 and the PCM far memory 530. In addition, in one embodiment, the MSC cache controller 512 interfaces with the CPUs and provides the standard synchronous load/store interface used in traditional DRAM-based memory subsystems. [00148] Examples of read and write operations will now be described in the context of the architecture shown in Figure 5A. In one embodiment, a read operation will arrive at the MSC controller 512, which will perform a lookup to determine if the requested data is present (for example, using the tag cache 511). If present, it will return the data to the requesting CPU core 101-104 or I/O device (not shown). If the data is not present, the MSC controller 512 will send the request, along with the system memory address (also referred to herein as the memory channel address or MCA), to the PCM far memory controller 521. The PCM controller 521 will use the address indirection table 520 to translate the address to a PDA and will direct the read operation to that region of the PCM. Upon receiving the requested data from the PCM far memory 530, the PCM controller 521 will return the requested data to the MSC controller 512, which will store the data in the MSC near memory 518 and will also send the data to the requesting CPU core 501 or I/O device. Subsequent requests for this data can be serviced directly from the near memory of the MSC until the data is replaced by other data from the PCM. [00149] In one embodiment, a memory write operation also goes first to the MSC controller 512, which writes it to the MSC near memory 518. In this embodiment, the data may not be sent directly to the PCM far memory 530 when a write operation is received. For example, the data may be sent to the PCM far memory only when the location in the MSC near memory 518 in which the data is stored is to be reused for storing data for a different system memory address. When this happens, the MSC controller 512 notices that the data is not present in the PCM far memory 530, and so it retrieves it from the near memory 518 and sends it to the PCM controller 521. 
The PCM controller 521 looks up the PDA for the system memory address and then writes the data to the PCM far memory 530. [00150] In one embodiment, the size of the MSC near memory 518 will be governed by the workload memory requirements as well as the near and far memory performance. For a DRAM-based MSC, the size can be set to one-tenth the size of the workload memory space or of the PCM far memory 530 size. Such an MSC is very large compared to the conventional caches found in current processor/system architectures. By way of example, and not limitation, for a 128 GB PCM far memory, the MSC near memory size can be as large as 16 GB. [00151] Figure 5B illustrates additional details associated with one embodiment of the MSC 510. This embodiment includes a set of logical units responsible for commands and addressing, including a command staging unit 542 for staging commands and addresses and a cache access mode control unit 544 that selects an MSC operating mode in response to a control signal from an MSC range register (RR) unit 545. Several examples of operating modes are described below. Briefly, they can include modes in which near memory is used in a traditional caching role and modes in which the near memory 518 is part of system memory. A tag check/command scheduler 550 uses tags from the tag cache 511 to determine if a particular cache line is stored in the near memory 518, and a near memory controller 515 generates channel address signals (e.g., CAS and RAS signals). [00152] This embodiment also includes a set of logical units responsible for data routing and processing, including a set of data buffers 546 to store data fetched from near memory or to be stored to near memory. In one embodiment, a prefetch data cache 547 is also included to store data prefetched from near memory and/or far memory. However, the prefetch data cache 547 is optional and is not required for complying with the underlying principles of the invention. [00153] An error correction code (ECC) generator/checker unit 552 generates and checks ECCs to ensure that data written to or read from near memory is error free. As discussed below, in one embodiment of the invention, the ECC generator/checker unit 552 is modified to also store the cache tags. Specific ECC schemes are well understood by those skilled in the art and therefore will not be described in detail here. The channel controller 553 couples the data bus of the near memory 518 to the MSC 510 and generates the electrical signals necessary to access the near memory 518 (for example, the RAS and CAS signaling for a DRAM near memory). [00154] Also illustrated in Figure 5B is a far memory control interface 548 for coupling the MSC 510 to far memory. In particular, the far memory control interface 548 generates the MCAs necessary to address the far memory and transfers data between the data buffers 546 and the far memory. [00155] As mentioned, the near memory 518 used in one embodiment is very large compared to the conventional caches found in current processor/system architectures. Consequently, the tag cache 511 that maintains the translation of system memory addresses to near memory addresses can also be very large. The cost of storing and fetching the MSC tags can be a significant impediment to building large caches. As such, in one embodiment of the invention, this problem is solved with an innovative scheme that stores the cache tags in the storage allocated in the MSC for ECC protection, thus essentially removing the storage cost of the tags. 
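By way of illustration only, the co-location of the tag with the ECC for each 64-byte line, as detailed in the following paragraphs and Figures 5C-5D, can be sketched as below. The C layout is an assumption; the field widths approximately follow the described 3-byte tag, with the reserved field narrowed so that the fields fit in 24 bits in this sketch.

```c
/*
 * Illustrative sketch only: keeping the cache tag alongside the ECC bits
 * of each 64-byte line so that no separate tag store is needed and the
 * tag can be checked in parallel with the ECC check.
 */
#include <stdint.h>

struct msc_line_metadata {
    uint8_t  data[64];        /* the cached 64-byte line                    */
    uint32_t ecc;             /* 4-byte ECC computed over data[]            */
    /* 3-byte (24-bit) tag packed next to the ECC */
    uint32_t addr_bits : 17;  /* upper address bits of the cached line      */
    uint32_t reserved  : 2;   /* reserved (narrowed to fit 24 bits here)    */
    uint32_t directory : 2;   /* remote-CPU caching information             */
    uint32_t state     : 2;   /* 00 = clean, 01 = dirty                     */
    uint32_t valid     : 1;   /* 1 = line valid, 0 = invalid                */
};

/* The tag check is performed while the ECC over data[] is being verified. */
static inline int tag_hit(const struct msc_line_metadata *m,
                          uint32_t expected_addr_bits)
{
    return m->valid && m->addr_bits == expected_addr_bits;
}
```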
[00156] This embodiment is generally illustrated in Figure 5C, which shows an integrated tag cache and ECC unit 554 to store/manage the cache tags, store the ECC data and perform the ECC operations. As illustrated, the stored tags are provided to the tag check/command scheduler 550 upon request when performing tag check operations (e.g., to determine whether a particular block of data is stored in the near memory cache 518). [00157] Figure 5D illustrates the organization of an example data set 524 and a corresponding ECC 523 and tag 522. As illustrated, the tag 522 is co-located with the ECC 523 in the memory of the tag cache/ECC unit 554 (e.g., DDR DRAM in one embodiment). In this example, several blocks of data totaling 64 bytes were loaded into the tag cache/ECC unit 554. An ECC check/generation unit 554a generates an ECC using the data 525 and compares the generated ECC with the existing ECC 523 associated with the data. In this example, a 4-byte ECC is generated for the 64 bytes of data 525. However, the underlying principles of the invention are not limited to any particular type or size of ECC. Furthermore, it should be noted that the term "data" is used here broadly to refer to both executable program code and data, both of which can be stored in the data store 525 illustrated in Figure 5D. [00158] In one embodiment, a 3-byte (24-bit) tag 522 is used with the bit assignments illustrated in Figure 5D. Specifically, bits 00-16 are address bits, which provide the upper address bits of the cache line. For a 56-bit system address (e.g., SPA [55:00]), bits 00-16 map to bits 55-29 of the system address, allowing for a smallest cache size of 512 MB. Returning to the 3-byte tag, bits 17-19 are reserved; bits 20-21 are directory bits that provide information about remote CPU caching of the cache line (for example, giving an indication of the other CPUs on which the line is stored); bits 21-22 indicate the current state of the cache line (for example, 00 = clean; 01 = dirty; 10 and 11 = not used); and bit 23 indicates whether the cache line is valid (for example, 1 = valid; 0 = invalid). [00159] Using a directly mapped cache architecture as described above, which allows the near memory address to be extracted directly from the system memory address, reduces or eliminates the latency cost of the tag lookup before the MSC 510 can be read, thus significantly improving performance. In addition, the time to check the cache tags to decide whether the MSC 510 has the necessary data is also eliminated, as the check is done in parallel with the ECC check of the data read from the MSC. [00160] Under certain conditions, storing the tags with the data can create a problem for writes. A write first reads the data to ensure that it will not overwrite data for some other address. Such a read before each write can become expensive. One embodiment of the invention utilizes a "dirty" line tag cache that holds the tags of recently accessed near memory addresses (NMAs). Since many writes target addresses that were recently accessed, a reasonably small tag cache can achieve an effective hit ratio to filter most reads before a write. [00161] Additional details relating to one embodiment of a PCM DIMM 519, including a PCM far memory controller 521 and a set of PCM far memory modules 530a-i, are illustrated in Figure 5E. In one embodiment, a single pool of PCM far memory 530a-i is dynamically shared between the system's memory and storage uses. In this embodiment, the entire PCM pool 530a-i can be subdivided into 4 KB "blocks". 
A PCM Descriptor Table (PDT) 565 identifies the use of each PCM block as memory or storage. For example, each PDT row can represent a particular block, with a given column identifying the usage of each block (for example, 1 = memory; 0 = storage). In this embodiment, the initial system configuration can partition the PCM blocks within the PCM 530a-i between storage and memory usage (that is, by programming the PDT 565). In one embodiment, the same table is used to exclude bad blocks and to provide spare blocks for usage leveling operations. In addition, the PDT 565 can also include a mapping of each PCMS block to a "logical" block address used by the software. In the case of system memory, the logical block address is the same as the MCA or SPA. This association is required to update the Address Indirection Table (AIT) 563 whenever a PCMS block is moved due to usage leveling. When this happens, the logical block address used by the software needs to be mapped to a different PCMS Device Address (PDA). In one embodiment, this mapping is stored in the AIT and is updated on each usage leveling move. [00162] As illustrated, the PCM controller 521 includes a system physical address (SPA)-to-PCM mapper 556, which operates in response to a usage management unit 555 and an address indirection unit 563 to map SPAs to PCM blocks. In one embodiment, the usage management logic 555 implements a usage leveling algorithm to account for the fact that the storage cells of the PCM 530a-530i begin to wear out after many write and/or erase accesses. Usage leveling spreads writes and erases throughout the memory cells of the PCM device, for example, by forcing data blocks with low cycle counts to occasionally move, thus allowing data blocks with high cycle counts to be placed in the memory cells that stored the low-cycle-count data blocks. Typically, most blocks do not cycle, but blocks with high cycle counts are more likely to fail, and usage leveling swaps the addresses of high-cycle-count blocks with those of low-cycle-count blocks. The usage management logic 555 can track cycle counts using one or more counters and registers (for example, the counters can be incremented by 1 whenever a cycle is detected and the result can be stored in the register set). [00163] In one embodiment, the address indirection logic 563 includes an address indirection table (AIT) containing an indication of the PCM blocks to which write operations should be directed. The AIT can be used to automatically move blocks between memory and storage uses. From a software perspective, accesses to all blocks use traditional load/store memory semantics (that is, usage leveling and address indirection operations occur transparently to the software). In one embodiment, the AIT is used to translate the SPA generated by software into a PDA. This translation is necessary because, given the need to wear the PCMS devices uniformly, data will need to be moved around in PDA space to avoid hotspots. When such a move occurs, the relationship between the SPA and the PDA will change, and the AIT will be updated to reflect this new translation. [00164] After mapping the SPA to the PCM, a scheduling unit 557 schedules the underlying PCM operations (e.g., reads and/or writes) to the PCM devices 530a-i, and a PCM protocol engine 558 generates the electrical signaling necessary to perform the read/write operations. An ECC unit 562 performs error detection and correction operations, and data buffers 561 temporarily store the data being read from or written to the PCM devices 530a-i. 
A persistent write buffer 559 is used to hold data that is guaranteed to be written back to the PCMS even in the case of an unexpected power failure (for example, it is implemented using non-volatile storage). Flush support logic 560 is included to flush the persistent write buffers to the PCMS, either periodically and/or according to a specified data flush algorithm (for example, after the persistent write buffers reach a specified threshold). [00165] In one embodiment, the MSC 510 automatically forwards storage accesses directly to the PCM far memory controller 521 and memory accesses to the MSC cache control unit 512. Storage accesses sent to the PCM far memory controller 521 are treated as normal reads and writes, and the address indirection and usage leveling mechanisms described here are applied as usual. An additional optimization is used in one embodiment of the invention, which can be implemented when data must move between memory and storage. Since a common PCM pool 530a-i is used, data movement can be eliminated or deferred simply by changing the pointers in the translation tables (e.g., the AIT). For example, when data is transferred from storage to memory, a pointer identifying the data in a particular PCM physical storage location can be updated to indicate that the same PCM physical storage location is now a memory location in system memory. In one embodiment, this is accomplished by hardware in a software-transparent way to provide both performance and power benefits. [00166] In addition to the software-transparent mode of operation, one embodiment of the MSC controller 512 provides alternative modes of operation, as indicated by the MSC range registers (RRs) 545. These modes of operation include, but are not limited to, the following: [00167] Direct PCM memory access for storage-class applications. Such a use will also require the MSC controller 512 to ensure that writes presented to the PCM 519 are actually committed to a persistent state. [00168] Hybrid use of the near memory 518, exposing parts of it to software for direct use while keeping the rest as an MSC. When part of the near memory 518 is exposed to software for direct use, that part is directly addressable within the system address space. This allows certain applications to explicitly divide their memory allocation between a small, high-performance region (the near memory 518) and a larger region of relatively lower performance (the far memory 530). On the other hand, the part allocated as cache within the MSC is not part of the system address space (but instead acts as a cache for the far memory 530, as described here). [00169] As discussed previously, the MSC architecture is defined in such a way that several different approaches to system partitioning are possible. These approaches fall into two general categories: [00170] Split Architecture: In this scheme the MSC controller 512 is located on the CPU and intercepts all system memory requests. There are two separate MSC interfaces that come out of the CPU to connect the near memory (e.g., DRAM) and the far memory (e.g., PCM). Each interface is tailored to the specific type of memory and each memory can be scaled independently in terms of performance and capacity. [00171] Unified Architecture: In this scheme, a single memory interface leaves the CPU and all memory requests are sent to this interface. The MSC controller 512, along with the near memory (e.g., DRAM) and far memory (e.g., PCM) subsystems, is consolidated outside the CPU behind this single interface. 
In one embodiment, this memory interface is tailored to meet the CPU's memory performance requirements and supports an out-of-order transactional protocol. Near and far memory requirements are met in a "unified" way over each of these interfaces. [00172] Within the scope of the above categories, several partitioning options are possible, some of which are described below. 1) Split example: [00173] Near memory: DDR5 DIMMs [00174] Near memory interface: one or more DDR5 channels [00175] Far memory: PCM device/controller on a PCI Express (PCIe) card [00176] Far memory interface: PCIe x16, Gen 3 2) Unified example: [00177] CPU memory interface: one or more KTMI (or QPMI) channels [00178] Near/far memory with MSC/PCM controller on a riser card [00179] Near memory interface outside the MSC/PCM controller: DDR5 interface [00180] Far memory interface outside the MSC/PCM controller: PCM device interface EMBODIMENTS WITH DIFFERENT MODES OF OPERATION OF NEAR MEMORY [00181] As discussed above, a two-level memory hierarchy can be used for introducing fast non-volatile memory, such as PCM, as system memory, while using a very large DRAM-based near memory. The near memory can be used as a hardware-managed cache. However, some applications are not friendly to hardware caching and, as such, would benefit from alternative ways of using such memory. As there are several different applications running on a server at any given time, one embodiment of the invention allows multiple usage modes to be enabled simultaneously. In addition, one embodiment provides the ability to control the allocation of near memory for each of these usage modes. [00182] In one embodiment, the MSC controller 512 provides the following modes for using near memory. As mentioned earlier, in one embodiment, the current operating mode can be specified by operation codes stored in the MSC range registers (RRs) 545. [00183] (1) Write-back caching mode: In this mode, all or part of the near memory 518 is used as a cache for the PCM memory 530. While in write-back mode, each write operation is initially directed to the near memory 518 (assuming the cache line to which the write is directed is present in the cache). The corresponding write operation is performed to update the PCM far memory 530 only when the cache line in the near memory 518 has to be replaced by another cache line (in contrast to the write-through mode described below, in which each write operation is immediately propagated to the far memory 530). [00184] In one embodiment, a read operation will first arrive at the controller of the MSC 124, which will perform a lookup to determine if the requested data is present in the near memory 518 acting as a cache for the FM (e.g., using a tag cache 511). If present, it will return the data to the requesting CPU, core 501 or I/O device (not shown in Figure 5A). If the data is not present, the MSC controller 124 will send the request, along with the system memory address, to a PCM far memory controller 521. The PCM far memory controller 521 will translate the system memory address to a PCM physical device address (PDA) and will direct the read operation to this region of the far memory 530. As mentioned earlier, this translation can utilize an address indirection table (AIT) 563, which the PCM controller 521 uses to translate between system memory addresses and PCM PDAs. In one embodiment, the AIT is updated as part of the usage leveling algorithm implemented to distribute memory access operations and thus reduce wear on the PCM far memory 530. 
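By way of illustration only, the write-back mode read path just described (and its completion, including the near memory fill, described in the following paragraph) can be sketched as below. The helper functions stand in for the tag cache, near memory, AIT and PCMS device access; they and their signatures are assumptions for the sketch, not the actual controller interfaces.

```c
/*
 * Illustrative sketch only: the read path in write-back caching mode.
 * A hit is serviced from near memory; a miss is forwarded to the far
 * memory controller, which translates the address through the AIT.
 */
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define LINE_BYTES 64

/* Hypothetical stand-ins for the tag cache, near memory, AIT and PCMS. */
static bool msc_tag_lookup(uint64_t addr, uint64_t *nma) { (void)addr; *nma = 0; return false; }
static void near_mem_read(uint64_t nma, void *line)      { (void)nma; memset(line, 0, LINE_BYTES); }
static void near_mem_fill(uint64_t addr, const void *l)  { (void)addr; (void)l; }
static uint64_t ait_translate(uint64_t addr)             { return addr; /* identity for the sketch */ }
static void pcm_read(uint64_t pda, void *line)           { (void)pda; memset(line, 0, LINE_BYTES); }

void msc_handle_read(uint64_t addr, void *line_out)
{
    uint64_t nma;

    if (msc_tag_lookup(addr, &nma)) {
        near_mem_read(nma, line_out);   /* hit: service from near memory */
        return;
    }

    /* Miss: forward to the far memory controller, which translates the
     * system address to a PCM device address through the AIT. */
    uint64_t pda = ait_translate(addr);
    pcm_read(pda, line_out);

    near_mem_fill(addr, line_out);      /* fill near memory for reuse    */
}
```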
[00185] Upon receiving the requested data from the PCM far memory 530, the PCM far memory controller 521 will return the requested data to the MSC controller 512, which will store it in the MSC near memory 518 and also send it to the requesting processor core 501 or I/O device (not shown in Figure 5A). Subsequent requests for this data can be serviced directly from the near memory 518 until they are overwritten by other PCM far memory data. [00186] In one embodiment, a memory write operation also goes first to the MSC controller 512, which writes it to the MSC near memory 518 acting as the FM cache. In this embodiment, the data may not be sent directly to the PCM far memory 530 when a write operation is received. For example, the data may be sent to the PCM far memory 530 only when the location in the MSC near memory acting as the FM cache 518 in which the data is stored is to be reused for storing data for a different system memory address. When this happens, the MSC controller 512 notices that the data is not present in the PCM far memory 530, and so it retrieves it from the near memory acting as the FM cache 518 and sends it to the PCM far memory controller 521. The PCM controller 521 looks up the PDA for the system memory address and then writes the data to the PCM far memory 530. [00187] (2) Near memory bypass mode: In this mode, all reads and writes bypass the NM acting as a cache for the FM 518 and go directly to the PCM far memory 530. This mode can be used, for example, when an application is not cache-friendly or when it requires data to be committed to persistence at the granularity of a cache line. In one embodiment, the caching performed by the processor caches 503 and by the NM acting as an FM cache 518 operate independently of each other. Consequently, data can be cached in the NM acting as an FM cache 518 that is not cached in the processor caches 503 (and which, in some cases, may not be allowed to be cached in the processor caches 503) and vice versa. Thus, certain data that might be designated as "uncacheable" in the processor caches 503 can be cached within the NM acting as an FM cache 518. [00188] (3) Near Memory Read-Cache Write Bypass Mode: This is a variation of the above mode, where read caching of the persistent data from the PCM 519 is allowed (that is, the persistent data is cached in the MSC 510 for read-only operations). This is useful when most of the persistent data is "read-only" and the application usage is cache-friendly. [00189] (4) Near Memory Read-Cache Write-Through Mode: This is a variation of the previous mode, where, in addition to read caching, write hits are also cached. Each write to the MSC near memory 518 causes a write to the PCM far memory 530. Thus, due to the write-through nature of the cache, the persistence of the cache line is still guaranteed. [00190] (5) Near memory direct access mode: In this mode, all or parts of the near memory are directly visible to software and are part of the system memory address space. Such memory can be completely under the control of the software. Any data movement from the PCM memory 519 to this near memory region requires explicit software copies. Such a scheme can create a non-uniform memory access (NUMA) memory domain for software, where it gets higher performance from the near memory 518 relative to the PCM far memory 530. Such a utilization can be employed for certain high-performance computing (HPC) and graphics applications that require very fast access to certain data structures. 
This mode of direct near memory access is equivalent to "pinning" certain cache lines in near memory. Such pinning can be done effectively in larger, multi-way, set-associative caches. [00191] Table A below summarizes each of the modes of operation described above. [00192] TABLE A [00193] The processor and chipset components used to implement the above modes of operation include the following: [00194] A memory-side cache controller 512 that manages the near memory in a two-level memory (2LM) hierarchy. [00195] A set of range registers 545 (see Figure 5B) in the memory-side cache 510 that determines the system address ranges for each of the modes of operation described above. [00196] A mechanism for acknowledging write completions from the PCM memory subsystem 519 to the MSC controller 515. [00197] A mechanism for invalidating lines in the near memory 518. [00198] A flush mechanism to write dirty lines back to the PCM and invalidate specific regions of the near memory address space. [00199] In one embodiment, the memory ranges for each of the usage modes are contiguous in the system address space. However, multiple disjoint regions can use the same mode. In one embodiment, each mode range register within the MSC RR set 545 provides the following information: [00200] the mode of operation (e.g., write-back, near memory bypass mode, etc.); [00201] the range base in the system address space (for example, at a granularity of 2 MB or greater); and [00202] a range mask field that identifies the size of the region. [00203] In one embodiment, the number of supported modes is implementation-specific, but it is assumed that only one contiguous system address range is available for each of the operating modes. If a direct near memory access range register is specified, it is assumed to be mapped to a contiguous region starting at the bottom of the near memory address space. Such a contiguous region must be smaller than the near memory size. Also, if any of the cache modes are being used, the direct access region size must be smaller than the near memory size to allow proper cache sizing for adequate performance. This allocation of near memory to the various modes is configurable by the user. [00204] In short, one embodiment of the invention is performed according to the following set of operations: [00205] When any read or write arrives at the memory-side cache controller 512, it checks the range registers 545 (Figure 5B) to determine the current operating mode. [00206] For any cache bypass read/write access, the MSC controller 512 checks whether the address is currently cached. If so, it must invalidate the line before sending a write completion back to the source. [00207] For any direct PCM write bypass operation, the MSC controller 512 waits for a completion back from the PCM controller 521 to ensure that the write is committed to a globally visible buffer. [00208] Any read or write to the direct access mode space in near memory is directed to the appropriate region of near memory. No transactions are sent to the PCM memory. [00209] Any change to the range register configuration to increase or decrease any existing region or add a new region will require flushing the appropriate cached regions to the PCM. For example, if the software intends to increase the size of the direct access mode region by reducing the write-back cache region, it can do so by first writing back and invalidating the appropriate portion of the near memory region, and then changing the near memory direct access mode range register. 
The MSC controller 510 will then know that future caching is performed to a smaller near memory address space. [00210] A particular embodiment of the invention in which the system physical address (SPA) space is divided among multiple MSCs is illustrated in Figure 6A. In the illustrated embodiment, the MSC cache 654 and controller 656 are associated with the SPA region 667a; the MSC cache 655 and controller 657 are associated with the SPA region 667b; the MSC cache 661 and controller 663 are associated with the SPA region 667c; and the MSC cache 660 and controller 662 are associated with the SPA region 667d. Two CPUs, 670 and 671, are illustrated, each with four cores, 650 and 651, respectively, and a home agent, 652 and 653, respectively. The two CPUs, 670 and 671, are coupled to a common far memory controller 666 via far memory interfaces 659 and 665, respectively. [00211] Thus, in Figure 6A, the entire SPA memory space is subdivided into regions, with each region associated with a particular MSC and controller. In this embodiment, a given MSC can have a non-contiguous SPA space allocation, but no two MSCs will have overlapping SPA spaces. Furthermore, the MSCs are associated with non-overlapping SPA spaces and no coherence techniques are required between the MSCs. [00212] Any of the near memory modes described above can be used in the architecture shown in Figure 6A. For example, each MSC controller 656-657, 662-663 can be configured to operate in Write-Back Cache Mode, Near Memory Bypass Mode, Near Memory Read-Cache Write Bypass Mode, Near Memory Read-Cache Write-Through Mode, or Near Memory Direct Access Mode. As discussed earlier, the particular mode is specified within the range register (RR) 655 of each MSC 610. [00213] In one embodiment, different MSCs can simultaneously execute different modes of operation. For example, the range registers of the MSC controller 656 can specify the near memory direct access mode, the range registers of the MSC controller 657 can specify the write-back cache mode, the range registers of the MSC controller 662 can specify the read-cache/write bypass mode, and the MSC controller 663 can specify the read-cache/write-through mode. Also, in some embodiments, individual MSCs can simultaneously implement different modes of operation. For example, the MSC controller 656 can be configured to implement the direct near memory access mode for certain system address ranges and the near memory bypass mode for other system address ranges. [00214] The above combinations are merely examples of the ways in which the MSC controllers can be programmed independently. The underlying principles of the invention are not limited to these or any other combinations. [00215] As described in relation to some of the embodiments above (for example, as described in relation to Figure 4G), an MSC and its MSC controller are configured to operate on the same memory channel (for example, the same physical DDR bus) as the PCM DIMM responsible for that particular SPA range. Consequently, in this embodiment, memory operations that occur within the designated SPA range are localized within the same memory channel, thus reducing data traffic through the CPU fabric interconnect. [00216] Figure 6B provides a graphical representation of how the system memory address map 620, the near memory address map 621 and the PCM address map 622 may be configured in accordance with embodiments of the invention. 
As previously discussed, the MSC controller 606 operates in a mode identified by the range registers (RRs) 605. The system memory map 620 has a first region 602 allocated for a direct near memory access mode, a second region 603 allocated for a near memory bypass mode, and a third region 605 allocated for a write-back cache mode. The MSC controller 606 provides access to near memory as indicated by the near memory address map 621, which includes a first region 608 allocated for a write-back cache mode and a second region 609 allocated for a direct near memory access mode. As illustrated, near memory bypass operations are provided directly to the PCM controller 610 operating according to the PCM address map 622, which includes a near memory bypass region 611 (for the near memory bypass mode) and a write-back cache region 612 (for the write-back cache mode). Consequently, the system memory map 620, the near memory address map 621 and the PCM address map 622 can be subdivided based on the specific modes implemented by the MSC controllers. [00217] Figures 6C and 6D illustrate addressing techniques used in one embodiment of the invention (some of which may have already been described in generic terms). In particular, Figure 6C shows how a system physical address (SPA) 675 is mapped to a near memory address (NMA) or to a PCM device address (PDA). In particular, the SPA is decoded by decoding logic 676 within a processor to identify a home agent 605 (e.g., the home agent responsible for the decoded address space). The decoding logic 677 associated with the selected home agent 605 further decodes the SPA 675 (or a portion of it) to generate a memory channel address (MCA) that identifies an appropriate MSC cache controller 612 assigned to that particular SPA space. The selected cache controller 612 then maps the memory access request to a near memory address at 678, optionally followed by an interleaving operation at 680 (described below), or, alternatively, performs an optional interleaving operation at 679, followed by mapping 681 by the PCM far memory controller to a PCM device address (PDA) (for example, using address indirection and usage management as described above). [00218] One embodiment of an optional interleaving process is illustrated in Figure 6D, which shows how software pages can be divided across multiple MSCs and PCM address spaces using interleaving. In the example shown in Figure 6D, two pages 682-683 in the SPA space are interleaved by the cache line interleaving logic 685 to generate two sets of interleaved lines 685-686 within the MCA space. For example, all odd lines from the memory pages 682-683 (e.g., lines 1, 3, 5, etc.) can be sent to a first MCA space 685, and all even lines from the memory pages 682-683 (e.g., lines 2, 4, 6, etc.) can be sent to a second MCA space 686. In one embodiment, the pages are 5 kBytes, although the underlying principles of the invention are not limited to any page size. The PCM controllers 687-688, operating in accordance with the address indirection tables (AITs) and usage management logic, then rearrange the cache lines within the PCM device address (PDA) memory space (as described above). Interleaving of this nature can be used to distribute the workload across the MSCs 610 and/or PCM devices 619 (for example, as an alternative to non-uniform memory access (NUMA)). 
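By way of illustration only, the cache line interleaving just described can be sketched as a simple mapping from a page offset to one of two MCA spaces, with odd-numbered lines going to the first space and even-numbered lines to the second. The line size and the layout of the result are assumptions for the sketch.

```c
/*
 * Illustrative sketch only: interleaving the cache lines of a software
 * page across two MCA spaces, as described for Figure 6D.
 */
#include <stdint.h>

#define LINE_SIZE 64u                 /* assumed cache-line size in bytes */

struct mca_target {
    int      space;                   /* 0 = first MCA space, 1 = second  */
    uint64_t offset;                  /* line offset within that space    */
};

struct mca_target interleave_line(uint64_t page_offset_bytes)
{
    uint64_t line0 = page_offset_bytes / LINE_SIZE;   /* 0-based line index */
    struct mca_target t;

    /* Counting lines from 1 as in the description: lines 1, 3, 5, ...
     * (0-based even indices) go to the first MCA space; lines 2, 4, 6, ...
     * go to the second. */
    t.space  = (line0 % 2 == 0) ? 0 : 1;
    t.offset = (line0 / 2) * LINE_SIZE;
    return t;
}
```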
SYSTEM AND METHOD FOR THE INTELLIGENT RELEASE OF DATA FROM A PROCESSOR TO A MEMORY SUBSYSTEM [00219] In current processor designs, when the processor cache is flushed, no information is provided to the memory subsystem to differentiate between data that is no longer needed by the processor (and can therefore be discarded) and data that must be saved. As a result, all flushed data is saved. The performance of new architectures, such as those using PCM or, more specifically, PCMS memory, can be improved if such information is provided to the memory subsystem. For example, such information can be used to reduce the number of write operations and thereby reduce PCMS memory wear, particularly when used in combination with the existing usage leveling techniques. [00220] Usage leveling was previously discussed in detail in relation to Figure 5E, which shows a PCM controller 521 (a PCMS controller in one embodiment) with a system physical address (SPA)-to-PCM mapper 556 operating in response to a usage management unit 555 and an address indirection unit 563 for mapping SPAs to PCM blocks. In one embodiment, the usage management logic 555 implements a usage leveling algorithm to account for the fact that the storage cells of the PCM 530a-530i begin to wear out after many write and/or erase accesses. Usage leveling spreads writes and erases throughout the memory cells of the PCM device, for example, by forcing data blocks with low cycle counts to occasionally move, thus allowing data blocks with high cycle counts to be placed in the memory cells that stored the low-cycle-count data blocks. Typically, most blocks do not cycle, but blocks with high cycle counts are more likely to fail, and usage leveling swaps the addresses of high-cycle-count blocks with those of low-cycle-count blocks. The usage management logic 555 can track cycle counts using one or more counters and registers (for example, the counters can be incremented by 1 whenever a cycle is detected and the result can be stored in the register set). [00221] In one embodiment, the address indirection logic 563 includes an address indirection table (AIT) containing an indication of the PCM blocks to which write operations should be directed. The AIT can be used to automatically move blocks between memory and storage uses. From a software perspective, accesses to all blocks use traditional load/store memory semantics (that is, usage leveling and address indirection operations occur transparently to the software). In one embodiment, the AIT is used to translate the SPA generated by software into a PDA. This translation is necessary because, given the need to wear the PCMS devices uniformly, data will need to be moved around in PDA space to avoid hotspots. When such a move occurs, the relationship between the SPA and the PDA will change, and the AIT will be updated to reflect this new translation. [00222] In one embodiment, read and write data buffers 561 are used to read data from the PCMS 530a-i and write data to the PCMS in order to maximize the life of the PCMS parts, increase read/write performance and reduce power consumption. If cache flush information as described here is not provided to the PCM controller 521, it cannot differentiate data that is no longer needed by the processor from data that is currently in use, thus reducing the performance of the PCMS memory 530a-i. [00223] The embodiments of the invention described herein provide techniques for communicating memory free hint information to the memory subsystem, in order to improve the performance of the PCMS memory and reduce the number of write operations to the PCMS. 
The memory free hint information also ensures that data that needs to be saved is written back to the PCMS memory rather than remaining in the PCMS buffers during a power cycle. [00224] One embodiment of the invention uses the memory range registers 545 previously described with respect to Figures 5B-C to differentiate between DRAM memory (e.g., DDR) and memory behind a buffered interface 561, such as the PCMS memory controller 521. In one embodiment, when a range register indicates that a PCMS device is used to cache a certain set of data, the processor generates memory free "hints" to the memory subsystem when the processor cache is implicitly or explicitly flushed or when a memory free is requested. In one embodiment, the hints are provided to the PCM controller 521 over a QuickPath Interconnect (QPI)-type interface (a point-to-point processor interconnect designed by the assignee of this application, which replaces the front-side bus (FSB) in certain processor architectures). It should be noted, however, that the underlying principles of the invention are not limited to any particular type of interface for exchanging data between a processor and a memory subsystem. The memory subsystem uses the memory free hints to manage its buffering, ensuring that data is written to the PCMS memory only when needed. [00225] In one embodiment, a bit is added to the memory range registers 545 or, alternatively, to the memory type range registers (MTRRs) used in some processors to identify modes of access to certain memory ranges, such as "uncached" (UC), "write-through" (WT), "write combining" (WC), "write-protect" (WP) and "write-back" (WB). The new bit is used to indicate whether a memory free hint should be sent to the memory controller along with data flushed from that memory range. In one embodiment, when the processor frees a region of memory associated with a memory free hint range, the processor sends additional information to the memory controller, including the memory free hint. When the memory controller receives the memory free hint, it writes the buffered data back to the PCMS memory controller 521 (or to the actual PCMS memory 530a-i). By way of example, and not as a limitation, the processor can use a special memory flush instruction (MFLUSH) that identifies a memory page to be flushed and simultaneously (or later) provides a memory free hint to the memory controller in which the memory page resides. [00226] Figure 7 illustrates an example architecture, which includes multiple processors 704-705 that communicate with each other and with a PCMS memory controller 521 through a QPI interface. A south bridge module 706 is communicatively coupled to the CPUs via a direct media interface (DMI). In addition, each processor is communicatively coupled to its own DDR memory 702-703 (although DDR is not necessarily required for compliance with the underlying principles of the invention). [00227] Figure 8 illustrates an exemplary system memory map 801 for the system shown in Figure 7, which includes an entry 810 mapping a first range of system memory addresses to the DDR memory 702; an entry 811 mapping a second range of system memory addresses to the DDR memory 703; and an entry 812 mapping a third system address range to the PCMS memory 521. [00228] Figure 9 illustrates an exemplary memory range register 545 containing memory free hint data 901 according to an embodiment of the invention. 
[00229] Figure 10 illustrates a PCMS memory controller 521 according to an embodiment of the invention, which includes a system-address-to-local-memory-address translator 1001 that translates system addresses into PCMS device addresses. In one embodiment, the translator 1001 illustrated in Figure 10 includes the system physical address-to-PCMS address mapping module 556 and the address indirection module 563 illustrated in Figure 5E. In addition, in one embodiment, a set of configuration registers 1002 includes the memory range registers 545, which specify whether free hints will be provided for the various system address ranges as described herein.

[00230] Also illustrated in Figure 10 are the read/write buffers and the PCMS protocol engine (both previously described with respect to Figure 5E). Two separate memory channels are illustrated in Figure 10 (channels 0 and 1), each of which has "N" PCMS DIMMs. For example, DIMMs 1006 are assigned to memory channel 0 and DIMMs 1007 are assigned to memory channel 1. Of course, the underlying principles of the invention are not limited to any specific configuration of DIMMs and memory channels.

[00231] In one embodiment of the invention, the free hint processing logic 1005 associated with the read and write buffers 561 uses the memory free hint to decide whether to write the data from the buffers to the PCMS DIMMs 1006-1007 or to keep the data in the buffers.

[00232] Figure 11 illustrates an embodiment of a method for using the memory free hints. At 1101 the processor flushes its cache, and at 1102 a determination is made as to whether the memory free hint is enabled (for example, via a bit in an MRR or MTRR). If not, then at 1103 the system continues to run without the use of hints. If so, then at 1104 a memory free hint containing the address for the memory write or invalidate operations is sent to the PCMS memory controller.

[00233] At 1105, the memory controller receives the memory free hint from the processor, and at 1106 a determination is made as to whether the memory free hint was received with the address. If not, then a normal response is sent to the processor at 1107; for example, in one embodiment a normal response is posted, with an acknowledgment sent to the sender shortly after the data is received. If so, then at 1108 the data held in the PCMS buffers is stored in PCMS memory according to the memory free hint (for example, as described above): when the memory free hint is received, the data is written back to PCMS and the acknowledgment is then sent to the sender.
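On the controller side, the flow of steps 1105 to 1108 of Figure 11 can be sketched in C as shown below, purely by way of illustration: a flushed line that arrives with a memory free hint is committed to the PCMS devices before the acknowledgment is returned, while hint-less traffic receives a normal posted acknowledgment and drains from the buffers 561 later. The buffer layout and the helpers (translate_spa_to_pda, pcms_device_write, send_ack_to_processor) are hypothetical stand-ins for the translator 1001, the PCMS protocol engine and the response path.

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    #define LINE_SIZE 64

    /* One slot of the read/write buffers 561 inside the PCMS memory controller. */
    struct write_buffer_entry {
        uint64_t system_addr;
        uint8_t  data[LINE_SIZE];
        bool     valid;
    };

    /* Placeholders for the SPA-to-PDA translator 1001 and the PCMS protocol
     * engine; both are assumptions made for this sketch. */
    static uint64_t translate_spa_to_pda(uint64_t spa) { return spa; }
    static void pcms_device_write(uint64_t pda, const uint8_t *d) { (void)pda; (void)d; }
    static void send_ack_to_processor(uint64_t addr) { (void)addr; }

    /* Steps 1105-1108 of Figure 11 as seen by free hint processing logic 1005:
     * a flushed line arrives from the processor, with or without a hint. */
    static void controller_accept_flush(struct write_buffer_entry *slot,
                                        uint64_t addr, const uint8_t *line,
                                        bool memory_free_hint)
    {
        slot->system_addr = addr;
        memcpy(slot->data, line, LINE_SIZE);
        slot->valid = true;

        if (memory_free_hint) {
            /* 1108: commit the buffered data to the PCMS DIMMs before the
             * acknowledgment is returned to the sender. */
            pcms_device_write(translate_spa_to_pda(addr), slot->data);
            slot->valid = false;
            send_ack_to_processor(addr);
        } else {
            /* 1107: normal (posted) response: acknowledge right away and let
             * the line drain from the buffer later, which allows writes to be
             * coalesced and saves PCMS write cycles. */
            send_ack_to_processor(addr);
        }
    }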
[00234] In embodiments where the MFLUSH instruction is used, it may be designed as a blocking call if persistent storage is required. Performing a memory flush across the entire PCMS address range ensures that all data destined for PCMS memory is written back to the PCMS memory devices. Here, flush operations for a PCMS memory address range may be specified with a memory barrier (MFENCE) instruction followed by an MFLUSH instruction indicating a specific page number.

[00235] The embodiments of the invention described herein communicate memory free information to the memory subsystem in order to improve the performance of the PCMS memory by reducing the number of write operations to the PCMS. These embodiments also ensure that data that needs to be saved is written through the PCMS memory controller to the PCMS memory (instead of remaining in the PCMS buffers across a power cycle).

[00236] In one embodiment, one or more range registers (RRs) are added to the processor's memory controller logic (either integrated in the processor package or on separate silicon), where the range indicates the processor's byte-addressable persistent memory address space. For writes that target the persistent memory space, as determined by the aforementioned range registers, the memory controller logic ensures that the writes are in fact persistent (i.e., the PCMS controller returns an acknowledgment indicating that the write is persistent). Note that the PCMS controller could also hold the write in a cache protected against power failures before returning the acknowledgment and complete the write to PCMS memory later. This is a PCM/PCMS controller implementation choice.

[00237] The mechanism by which software ensures that persistent memory writes are, in fact, persistent is referred to as "durability". A persistent memory write is durable at a point in time when the contents of the write will be preserved regardless of power-cycle or reset conditions that occur after the durability point.

[00238] In the embodiment in which one or more range registers are added to the memory controller logic, software can ensure durability by issuing memory barrier or store barrier instructions. These barrier instructions simply count the writes outstanding from the processor core and wait for the memory controller logic to complete them. Since the memory controller logic issues completions only after all writes that fall within the range registers have been completed to the corresponding PCMS controller, completion of the barrier instruction is an indication to the software that its persistent memory writes are durable. This scheme is illustrated in Figure 12.

[00239] At 1201, the processor issues one or more writes to persistent memory and issues memory barrier operations to ensure durability. If a write hits an address covered by the memory controller range registers, determined at 1202, then at 1204 the memory controller logic issues a write to the PCMS controller and waits for an acknowledgment. Otherwise, the processor continues with regular execution at 1203.

[00240] When an acknowledgment is received from the PCMS controller, determined at 1205, then at 1207 the processor completes the memory barrier operation. Until the acknowledgment is received, the processor waits at 1206, looping back through 1204.
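The durability scheme of Figure 12 can likewise be sketched in C, by way of illustration only: persistent-memory writes that fall inside the range registers are counted as outstanding until the PCMS controller acknowledges them, and the store barrier completes only once the count reaches zero. The helper names (in_persistent_range, issue_write_to_pcms, on_pcms_ack, persistent_store_barrier) are assumptions for this sketch and do not denote real processor instructions; in use, software would issue its persistent writes and then call the barrier before relying on the data surviving a power cycle.

    #include <stdint.h>
    #include <stdbool.h>

    /* A range register describing the processor's byte-addressable persistent
     * memory address space, as added to the memory controller logic. */
    struct range_register {
        uint64_t base;
        uint64_t length;
    };

    #define NUM_RRS 2
    static struct range_register persistent_rr[NUM_RRS];

    static bool in_persistent_range(uint64_t addr)
    {
        for (int i = 0; i < NUM_RRS; i++)
            if (addr >= persistent_rr[i].base &&
                addr - persistent_rr[i].base < persistent_rr[i].length)
                return true;
        return false;
    }

    /* Placeholder for step 1204: the memory controller logic issues the write
     * to the PCMS controller; the acknowledgment arrives asynchronously. */
    static void issue_write_to_pcms(uint64_t addr, uint64_t value)
    {
        (void)addr; (void)value;
    }

    static volatile int outstanding_persistent_writes;

    /* Steps 1201-1204: a store is issued; only stores that hit the persistent
     * range registers are tracked as outstanding persistent writes. */
    static void pmem_write(uint64_t addr, uint64_t value)
    {
        if (in_persistent_range(addr)) {       /* 1202 */
            outstanding_persistent_writes++;
            issue_write_to_pcms(addr, value);  /* 1204 */
        } else {
            /* 1203: regular execution path for non-persistent memory (elided). */
        }
    }

    /* Step 1205: invoked when the PCMS controller's acknowledgment arrives. */
    static void on_pcms_ack(void)
    {
        outstanding_persistent_writes--;
    }

    /* Steps 1206-1207: the barrier waits until every tracked write has been
     * acknowledged; its completion indicates the writes are durable. */
    static void persistent_store_barrier(void)
    {
        while (outstanding_persistent_writes != 0)
            ;  /* spin until the memory controller logic reports completion */
    }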
[00241] Embodiments of the invention may include the various steps described above. The steps may be embodied in machine-executable instructions which may be used to cause a general-purpose or special-purpose processor to perform the steps. Alternatively, these steps may be performed by specific hardware components that contain hardwired logic for performing the steps, or by any combination of programmed computer components and custom hardware components.

[00242] As described herein, instructions may refer to specific configurations of hardware, such as application-specific integrated circuits (ASICs) configured to perform certain operations or having predetermined functionality, or to software instructions stored in memory embodied in a non-transitory computer-readable medium. Thus, the techniques shown in the figures can be implemented using code and data stored and executed on one or more electronic devices (e.g., an end station, a network element, etc.). Such electronic devices store and communicate (internally and/or with other electronic devices over a network) code and data using computer-readable media, such as non-transitory computer-readable storage media (e.g., magnetic disks; optical disks; random access memory; read-only memory; flash memory devices; phase-change memory) and transitory computer-readable communication media (e.g., electrical, optical, acoustical or other forms of propagated signals, such as carrier waves, infrared signals, digital signals, etc.). In addition, such electronic devices typically include a set of one or more processors coupled to one or more other components, such as one or more storage devices (non-transitory machine-readable storage media), user input/output devices (for example, a keyboard, a touchscreen and/or a display) and network connections. The coupling of the set of processors and the other components is typically done through one or more buses and bridges (also called bus controllers). The storage device and the signals carrying the network traffic respectively represent one or more machine-readable storage media and machine-readable communication media. Thus, the storage device of a given electronic device typically stores code and/or data for execution on the set of one or more processors of that electronic device. Naturally, one or more parts of an embodiment of the invention may be implemented using different combinations of software, firmware and/or hardware. Throughout this detailed description, for purposes of explanation, numerous specific details have been set forth in order to provide a thorough understanding of the present invention. It will be apparent to those skilled in the art, however, that the present invention may be practiced without some of these specific details. In certain instances, well-known structures and functions have not been described in detail so as to avoid obscuring the subject matter of the present invention. Accordingly, the scope and spirit of the invention should be judged in terms of the claims which follow.
Claims (29)

[0001] 1. Method for using memory free hints within a computer system, characterized in that it comprises: flushing data from a processor cache; determining whether memory free hints are enabled for a given range of system addresses allocated to a phase-change memory ("PCM") memory device; if memory free hints are enabled for the specified system address range, then generating a memory free hint for a PCM memory controller of the PCM memory device; and using the memory free hint to determine whether to save the flushed data to the PCM memory device.

[0002] 2. Method according to claim 1, characterized in that it further comprises: saving the flushed data in the PCM memory device according to the memory free hint.

[0003] 3. Method according to claim 1, characterized in that the PCM memory device comprises a "Phase Change Memory and Switch" (PCMS) type memory device.

[0004] 4. Method according to claim 1, characterized in that, if memory free hints are not enabled for the specified address range, the flushed data is saved in the PCM memory device.

[0005] 5. Method according to claim 1, characterized in that the operation of determining whether memory free hints are enabled for a specified system address range comprises reading an enable/disable bit stored in a memory range register, the enable/disable bit having a first value if memory free hints are enabled and a second value if memory free hints are disabled.

[0006] 6. Method according to claim 1, characterized in that it further comprises: using an address indirection table (AIT) to identify specific PCM memory blocks corresponding to the specified system address range.

[0007] 7. Method according to claim 1, characterized in that it further comprises: specifying a memory channel for the system address range.

[0008] 8. System, characterized in that it comprises: a processor having a data cache from which data is flushed, the data associated with a given range of system addresses; and a PCM memory controller for managing access to data stored in a PCM memory device corresponding to the given range of system addresses; the processor determining whether memory free hints are enabled for the specified system address range, wherein if memory free hints are enabled for the system address range, then the processor sends a memory free hint to a PCM memory controller of the PCM memory device, and wherein the PCM memory controller uses the memory free hint to determine whether the flushed data should be saved to the PCM memory device.

[0009] 9. System according to claim 7, characterized in that the PCM memory device is a PCMS memory device.

[0010] 10. System according to claim 7, characterized in that it further comprises: read and write buffers within the PCMS memory controller to buffer the data to be stored in accordance with the memory free hints.

[0011] 11. System according to claim 8, characterized in that, if memory free hints are not enabled for the specified address range, the PCMS memory controller saves the flushed data in the PCM memory device.

[0012] 12. System according to claim 8, characterized in that it comprises: a memory range register including an enable/disable bit to indicate whether memory free hints are enabled for a specified range of system addresses, the enable/disable bit having a first value if memory free hints are enabled and a second value if memory free hints are disabled.

[0013] 13.
System according to claim 12, characterized in that it further comprises: an address indirection table (AIT) to identify specific PCM memory blocks corresponding to the specified system address range.

[0014] 14. System according to claim 13, characterized in that it further comprises: a DIMM memory channel associated with the range of system addresses.

[0015] 15. System, characterized in that it comprises: a processor having a cache from which data is flushed, the data associated with a given range of system addresses; and a PCM memory controller for managing access to data stored in a PCM memory device corresponding to the given range of system addresses; wherein the processor sends a memory free hint to a PCM memory controller of the PCM memory device, and wherein the PCM memory controller uses the memory free hint to determine whether to save the flushed data to the PCM memory device based on the range of addresses or pages specified by an MFLUSH instruction.

[0016] 16. System according to claim 15, characterized in that the PCM memory device is a PCMS memory device.

[0017] 17. System according to claim 15, characterized in that it further comprises: read and write buffers within the PCMS memory controller to buffer the data to be stored in accordance with the memory free hints.

[0018] 18. System according to claim 15, characterized in that it further comprises: a DIMM memory channel associated with the system address range.

[0019] 19. System according to claim 15, characterized in that a FENCE instruction causes the specified range of memory addresses or pages to be flushed from the cache and the memory free hint to be sent to the PCMS memory controller.

[0020] 20. Method for using memory free hints within a computer system, characterized in that it comprises: issuing one or more persistent memory writes; issuing a memory barrier instruction with the one or more writes; determining whether a write hits an address covered by a memory controller range register; if so, then issuing a write to a PCM controller and waiting for an acknowledgment; and completing the memory barrier instruction upon receipt of the acknowledgment.

[0021] 21. Method according to claim 20, characterized in that it further comprises: continuing a normal mode of execution if the address is not covered by a range register.

[0022] 22. Method according to claim 20, characterized in that the PCM controller is a PCMS controller.

[0023] 23. System, characterized in that it comprises: processor means having a cache from which data is flushed, the data associated with a given range of system addresses; and PCM memory controller means for managing access to data stored in PCM memory device means corresponding to the given range of system addresses; the processor means determining whether memory free hints are enabled for the specified system address range, wherein if memory free hints are enabled for the system address range, then the processor means sends a memory free hint to PCM memory controller means of the PCM memory device means, and wherein the PCM memory controller means uses the memory free hint to determine whether the flushed data should be saved to the PCM memory device means.

[0024] 24. System according to claim 23, characterized in that the PCM memory device means is a PCMS memory device.

[0025] 25.
System according to claim 23, characterized in that it further comprises: read and write buffers within the PCMS memory controller to buffer the data to be stored in accordance with the memory free hints.

[0026] 26. System according to claim 25, characterized in that, if memory free hints are not enabled for the specified address range, the PCMS memory controller saves the flushed data in the PCM memory device.

[0027] 27. System according to claim 26, characterized in that it comprises: a memory range register including an enable/disable bit to indicate whether memory free hints are enabled for a specified range of system addresses, the enable/disable bit having a first value if memory free hints are enabled and a second value if memory free hints are disabled.

[0028] 28. System according to claim 27, characterized in that it further comprises: address indirection table (AIT) means for identifying specific PCM memory blocks corresponding to the specified system address range.

[0029] 29. System according to claim 28, characterized in that it further comprises: DIMM memory channel means associated with the system address range.