Abstract:
Methods, apparatus and software for implementing RMA APIs over Active Message (AM). AM write and AM read requests are sent by a local node to a remote node to write data to, or read data from, memory on the remote node using remote memory access (RMA) techniques. AM requests are processed by corresponding AM handlers that automatically perform the operations associated with the requests. For example, for an AM write request, an AM write request handler can write the data contained in the request to a remote address space in the remote node's memory, or generate a corresponding RMA write request that is placed in an RMA queue used in accordance with a tagged messaging scheme. Similar operations are performed by AM read request handlers. RMA read and write operations using AMs are further facilitated by associated RMA read, RMA write and RMA progress modules. Publication number: FR3025331A1 Application number: FR1557115 Filing date: 2015-07-24 Publication date: 2016-03-04 Inventors: Jianxin Xiong; Robert J Woodruff; Frank L Berry Applicant: Intel Corp; Main IPC class:
Description:
[0001] RMA API SUPPORT OVER ACTIVE MESSAGE
BACKGROUND
In a computer system with a distributed memory configuration (for example, a cluster), each compute node has direct access to its own attached local memory. Memory attached to other nodes is called remote memory. Usually, remote memory is not directly accessible, and message passing mechanisms are used to communicate between the nodes.
[0002] A Remote Memory Access (RMA) interface is a software interface that gives the impression that remote memory is directly accessible. There are many forms of RMA operations, but ultimately they reduce to two operations: read and write. The read operation copies data from a range of remote memory addresses to a local buffer. The write operation copies data from a local buffer into a range of remote memory addresses. Existing RMA operations generally fall into two categories. Operations in the first category make use of the Remote Direct Memory Access (RDMA) capability of the underlying interconnect fabric between the compute nodes. InfiniBand (IB) Host Channel Adapters (HCAs), such as those from Mellanox, support RDMA functions in hardware. This capability is exposed to applications through a software interface called IB Verbs. With IB Verbs, to perform an RMA operation, the user creates a work request and posts it to a work queue. The HCA then processes the queue and performs the RDMA in hardware. The approach used by the second category is to emulate RMA functions over a regular message passing interface. An example is the implementation of one-sided operations in MPICH / MPICH2 (high-performance MPI). This is often done through a request / response process. Due to the asynchronous nature of RMA operations, a separate thread is usually used to handle requests. Existing RMA operations have a number of disadvantages. For RDMA operations, specialized hardware is required.
For message-passing-based operations, the extra traffic of the associated messages must be carried over the interconnect fabric, thereby reducing the effective bandwidth of the fabric.
BRIEF DESCRIPTION OF THE DRAWINGS
The above-mentioned aspects and many of the associated advantages of this invention will be further appreciated from the following detailed description, in conjunction with the accompanying drawings, in which, unless otherwise indicated, like reference numerals refer to identical elements in the different figures.
Figure 1a is a diagram illustrating an RMA write operation in which a data block is written to the address space of a remote node through an AM write request handler, according to one embodiment;
Figure 1b is a diagram illustrating an RMA read operation in which a data block is read from the address space of a remote node through an AM read request handler, according to one embodiment;
Figure 2a is a diagram illustrating an RMA write operation in which data is transferred through a plurality of packets using the scheme shown in Figure 1a, according to one embodiment;
Figure 2b is a diagram illustrating an RMA read operation in which data is transferred through a plurality of packets using the scheme shown in Figure 1b, according to one embodiment;
Figure 3a is a diagram illustrating a tagged-message RMA write operation in which a block of data is written to the address space of a remote node using an AM request followed by a send operation, according to one embodiment;
Figure 3b is a diagram illustrating a tagged-message RMA read operation in which a data block is read from the address space of a remote node using an AM read request handler, an RMA progress module and an RMA read module, according to one embodiment;
Figure 4 is a timing diagram illustrating a message exchange between a local node and a remote node corresponding to an access code exchange and a memory registration operation; and
Figure 5 is a diagram illustrating a node architecture configured to facilitate aspects of local and remote node operations according to embodiments disclosed herein.
[0003] DETAILED DESCRIPTION
Embodiments of methods, apparatus, and software for implementing an RMA application programming interface (API) over Active Message (AM) are described herein. In the following description, many specific details are set out to provide a thorough understanding of the embodiments disclosed and illustrated herein. Those skilled in the art will understand, however, that the invention can be implemented without one or more of these specific details, or with other methods, components, materials, etc. In other cases, well-known structures, materials or operations are not shown or described in detail to avoid obscuring certain aspects of the invention. For the sake of clarity, some of the components shown in the Figures may also be referred to by their labels in the Figures rather than by a reference number. In addition, reference numbers designating a particular type of component (and not a particular component) may be followed by "(typ)", meaning "typical". It will be understood that the configuration of these components is typical of similar components that may exist but are not shown in the Figures for the sake of simplicity and clarity, or of other similar components that are not designated by separate reference numbers. Conversely, "(typ)" should not be interpreted to mean that the component, element, etc. is typically used for its disclosed function, implementation, purpose, etc.
[0004] In aspects of the embodiments set forth herein, active message techniques are used to facilitate RMA write and read operations in which data is written to or read from the memory of a remote node using RMA. The basic idea of AM is to allow a piece of code to be automatically executed on the target side when a message arrives. This code is called an AM handler. Several AM handlers may be pre-registered with the AM mechanism, and an AM message may refer to any of them by specifying an identifier corresponding to the desired AM handler. An AM message may be a request or a response, both of which may carry data plus some additional control information and may cause execution of the specified handler. An AM request may be issued anywhere except within an AM handler, whereas an AM response may be issued only within an AM handler, and at most one response may be issued for each AM request. In aspects of the embodiments disclosed herein, active message handlers are executed automatically when a corresponding AM request or response reaches its target. This provides the asynchronous processing mechanism needed for RMA operations. Basically, an RMA write operation may include an AM request that carries the source buffer data and an AM handler that copies the data to the target address. An RMA read operation may include an AM request that carries the target address information, an AM handler that returns the data via an AM response, and another AM handler that copies the data to the destination buffer. Figures 1a and 1b illustrate AM write and read mechanisms for transferring blocks of data between a local node 100 and a remote node 102, according to one embodiment. As shown in Figure 1a, the local node 100 comprises a source buffer (src_buf) 104, an RMA write module (rma_write()) 106, and an AM write response handler 108. The remote node 102 includes an AM write request handler 110 and an RMA address space 112.
The local node 100 is in communication with the remote node 102 via an interconnect 114. The RMA write of a source data block 116 from source buffer 104 to RMA address space 112 proceeds as follows. The source data 116 is read from source buffer 104 by RMA write module 106, which generates an AM write request 118 that is sent over interconnect 114 to AM write request handler 110. The AM write request 118 includes the source data 116 as well as information on the remote memory range to which the source data 116 is to be written. This includes the start address (rma_addr) in RMA address space 112 at which the start of the data 116 is to be written, and may optionally include the size (length), an access code and / or a message tag. The target-side AM handler (AM write request handler 110) then copies the source data 116 into RMA address space 112, starting at address rma_addr, via a DMA or memory copy operation 120. In one embodiment, when AM write request handler 110 has finished copying the source data 116 into RMA address space 112, an optional AM write response 122 may be returned to the initiator (local node 100) if a completion event is desired. As shown in Figure 1a, the AM write response 122 is sent to AM write response handler 108, which is configured to process the completion event. Figure 1b further shows resources and AM handlers used by the local node 100 and the remote node 102 in connection with an RMA read operation. These include a destination buffer (dest_buf) 124, an RMA read module (rma_read()) 126 and an AM read response handler 128 for the local node 100, and an AM read request handler 130 for the remote node 102. In one embodiment, AM write response handler 108 and AM read response handler 128 correspond to the same AM handler. Also, in one embodiment, AM write request handler 110 and AM read request handler 130 correspond to the same AM handler.
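The single-message write flow of Figure 1a can be sketched in a few lines of Python. This is a minimal in-process model, not the patent's implementation or any real AM library: the `Node` class, the `deliver` method standing in for the interconnect, the numeric handler identifiers and the `want_completion` flag are all hypothetical names introduced for illustration.

```python
# Hypothetical in-process model of Figure 1a; no name here comes from a real AM library.
AM_WRITE_REQ, AM_WRITE_RESP = 1, 2  # assumed handler identifiers


class Node:
    def __init__(self, mem_size=0):
        self.memory = bytearray(mem_size)  # stands in for the RMA address space
        self.handlers = {}                 # handler id -> registered AM handler
        self.completions = []              # completion events seen by the initiator

    def register(self, handler_id, fn):
        self.handlers[handler_id] = fn

    def deliver(self, sender, handler_id, header, payload=b""):
        # Models the interconnect: message arrival runs the handler named in the message.
        self.handlers[handler_id](self, sender, header, payload)


def am_write_request_handler(node, sender, header, payload):
    addr = header["rma_addr"]
    if addr < 0 or addr + len(payload) > len(node.memory):
        raise ValueError("write outside the RMA address space")
    node.memory[addr:addr + len(payload)] = payload  # the memory copy step
    if header.get("want_completion"):
        # At most one response per request, issued from inside the handler.
        sender.deliver(node, AM_WRITE_RESP, {"rma_addr": addr, "length": len(payload)})


def am_write_response_handler(node, sender, header, payload):
    node.completions.append((header["rma_addr"], header["length"]))


def rma_write(local, remote, src_buf, rma_addr, want_completion=True):
    header = {"rma_addr": rma_addr, "length": len(src_buf),
              "want_completion": want_completion}
    remote.deliver(local, AM_WRITE_REQ, header, bytes(src_buf))


local, remote = Node(), Node(mem_size=64)
local.register(AM_WRITE_RESP, am_write_response_handler)
remote.register(AM_WRITE_REQ, am_write_request_handler)
rma_write(local, remote, b"hello", rma_addr=8)
```

In this sketch the request carries both the payload and the target range, so the remote side needs no prior knowledge of the transfer; the response exists only to signal completion back to the initiator.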
[0005] The execution of an RMA read operation is as follows. RMA read module 126 sends to AM read request handler 130 an AM read request 132 identifying a remote address range, starting at rma_addr, in which the data to be read (e.g., the remote data 134) is stored. In one embodiment, the AM read request 132 includes the start address (rma_addr) and the length, and may include an access code and / or a message tag. In response to receiving the AM read request 132, AM read request handler 130 sends a copy of the remote data 134 from the remote node 102 to the local node 100 via an AM read response 136, which is sent to AM read response handler 128. On receiving the AM read response 136, AM read response handler 128 extracts the copy of the remote data 134 and performs a DMA or memory copy 138 to write the copy of the remote data 134 into destination buffer 124, thereby completing the RMA read operation.
[0006] In general, AM mechanisms may limit the amount of data that can be transferred by a single AM message. To overcome this limitation, an amount of data greater than the limit for a single AM message can be divided into smaller data units (e.g., packets) and transferred in a pipelined manner. Examples of RMA write and RMA read operations that transfer data between a local node 200 and a remote node 202 through an interconnect 204 using this technique are illustrated in Figures 2a and 2b. As shown in Figure 2a, the local node 200 includes a source buffer 206 with multiple slots 208, an RMA write module 210, and an AM write response handler 212. The remote node 202 includes an AM write request handler 214 and an RMA address space 216 divided into several slots 218. The pipelined RMA write proceeds as follows. Data stored in different slots of source buffer 206 is accessed by RMA write module 210 and transferred as a plurality of packets 220 using AM write requests 222.
As indicated above, in one embodiment, each AM write request 222 includes the start address (rma_addr) and optionally the length, an access code and / or a message tag. Upon receipt by AM write request handler 214, each AM write request 222 is processed, causing memory copies 224 that write the data to the corresponding address ranges, each starting at its rma_addr in RMA address space 216. In embodiments using completion events, one or more AM write responses 226 are returned to AM write response handler 212. For example, an AM write response may be returned for each completed packet write, or an AM write response may confirm the completion of multiple packet writes, for example by generating a single AM write response 226 for a given source data transfer regardless of the number of packets sent. For purposes of illustration, the source data portion of each packet 220 is represented as being stored in a corresponding slot 208 in source buffer 206 and written to a corresponding slot 218 in RMA address space 216; however, it will be understood that source buffer 206 and / or RMA address space 216 need not be divided into slots, and can generally be configured as one or more address spaces in which data can be stored. In addition, the packets 220 may all be the same size, or different sizes may be used. For example, in one embodiment, the packets use a maximum transmission unit (MTU) applicable to the underlying transport protocol used by interconnect 204 (note that the last packet may be smaller than the MTU). In all cases, the data contained in the packets must be written to RMA address space 216 so as to reproduce the source data block being transferred with the multiple packets. In one embodiment, the packet data is written in order, while in other embodiments out-of-order writes are allowed provided that, once the transfer is complete, the written data block is a reproduction of the source data block.
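The packetization described above can be illustrated with a short Python sketch. It assumes a fixed `mtu` payload limit and models the remote memory as a `bytearray`; the function names and the `order` parameter (used to simulate out-of-order arrival) are hypothetical and not taken from the patent.

```python
def split_into_packets(data, rma_addr, mtu):
    """Yield (start_address, chunk) pairs; only the last chunk may be shorter than mtu."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    for offset in range(0, len(data), mtu):
        yield rma_addr + offset, data[offset:offset + mtu]


def pipelined_rma_write(memory, data, rma_addr, mtu, order=None):
    """Write data into memory packet by packet; order optionally permutes arrival."""
    if rma_addr < 0 or rma_addr + len(data) > len(memory):
        raise ValueError("write outside the RMA address space")
    packets = list(split_into_packets(data, rma_addr, mtu))
    if order is not None:
        packets = [packets[i] for i in order]  # simulate out-of-order delivery
    for addr, chunk in packets:
        # Per-packet work of the AM write request handler: copy to the packet's own range.
        memory[addr:addr + len(chunk)] = chunk
    return len(packets)


memory = bytearray(32)
data = bytes(range(1, 11))  # 10 bytes -> packets of 4, 4 and 2 bytes with mtu=4
packet_count = pipelined_rma_write(memory, data, rma_addr=4, mtu=4, order=[2, 0, 1])
```

Because every packet carries its own start address, the writes commute: the final contents of the target range are the same whichever order the packets arrive in, which is why out-of-order writes can be tolerated.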
[0007] As shown in Figure 2b, the local node 200 further comprises a destination buffer 228 having a plurality of slots 230, an RMA read module 232 and an AM read response handler 234. The remote node 202 further comprises an AM read request handler 236. As with the multi-packet RMA write, the data is transferred by RMA read module 232 using multiple packets. Each packet transfer is similar to the data block transfer for the RMA read of Figure 1b, and proceeds as follows. RMA read module 232 sends an AM read request 240 to be processed by AM read request handler 236. Each AM read request 240 identifies a remote address range, beginning at rma_addr, in which the remote data to be read is stored. As noted above, in one embodiment, the AM read request 240 includes the start address (rma_addr), the length, and an optional access code and / or message tag.
[0008] In response to receiving each AM read request 240, AM read request handler 236 sends a copy of a portion of the remote data, comprising a packet 241, from the remote node 202 to the local node 200 using an AM read response 242 that is addressed to AM read response handler 234. Upon receiving each AM read response 242, AM read response handler 234 extracts the copy of the remote data in packet 241 and performs a DMA or memory copy 244 to write the copy of the remote data transferred by packet 241 into a corresponding slot 230 in destination buffer 228, thereby completing the RMA read operation for that packet. As with the multi-packet write operation shown in Figure 2a and described above, the use of slots for destination buffer 228 and RMA address space 216 is shown for illustrative purposes only, and packet sizes may vary in some embodiments. Packet size can be a limiting factor on the maximum bandwidth achievable with pipelined execution.
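A multi-packet read along the lines of Figure 2b might look like the following Python sketch. The request and response are modeled as plain dictionaries, and the `dest_offset` field, which tells the response handler where in the destination buffer a packet belongs, is an assumption of this sketch rather than a field named in the text.

```python
def am_read_request_handler(remote_memory, request):
    """Remote side: copy the requested range into an AM read response."""
    addr, length = request["rma_addr"], request["length"]
    if addr < 0 or length < 0 or addr + length > len(remote_memory):
        raise ValueError("read outside the RMA address space")
    return {"dest_offset": request["dest_offset"],
            "data": bytes(remote_memory[addr:addr + length])}


def am_read_response_handler(dest_buf, response):
    """Local side: copy the packet payload into its place in the destination buffer."""
    off, data = response["dest_offset"], response["data"]
    dest_buf[off:off + len(data)] = data


def pipelined_rma_read(remote_memory, rma_addr, length, max_payload):
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    dest_buf = bytearray(length)
    for off in range(0, length, max_payload):
        request = {"rma_addr": rma_addr + off,
                   "length": min(max_payload, length - off),
                   "dest_offset": off}  # hypothetical field, see lead-in
        am_read_response_handler(dest_buf, am_read_request_handler(remote_memory, request))
    return bytes(dest_buf)


remote_memory = bytearray(b"0123456789abcdef")
result = pipelined_rma_read(remote_memory, rma_addr=3, length=10, max_payload=4)
```

Each request in this sketch moves at most `max_payload` bytes, which plays the role of the per-message limit of the AM mechanism.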
For large data transfers, however, performance can be increased by using a tagged message transmission mechanism if such a mechanism is available. Tagged message transmission is similar to normal message transmission in that both are done by sending on one side and receiving on the other. However, tagged message transmission attaches a tag to each message so that the receiver can select which message to receive. This in effect turns a single communication channel into a multitude of channels. In one embodiment, the tag is used to identify different RMA operations, ensuring a one-to-one correspondence between matched send and receive operations.
[0009] Examples of write and read operations using tagged messages are shown in Figures 3a and 3b. As shown by the tagged-message write operation of Figure 3a, a source data block is transferred from a local node 300 to a remote node 302 via an interconnect 304. More precisely, the local node 300 comprises a source buffer 306, in which the source data is stored, and an RMA write module 308, while the remote node 302 comprises an AM write request handler 310, an RMA progress module 312, and an RMA address space 314. The tagged-message write operation proceeds as follows. The RMA write operation includes an AM write request 316 followed by a send operation 318. The AM write request 316 carries only information about the remote address range, without a data payload; instead, the send data 320 is transferred by the subsequent send operation 318. Upon receipt of the AM write request 316, AM write request handler 310 generates a corresponding RMA write request and places it, via an enqueue operation 322, in an RMA queue 324.
When the queued RMA write request is subsequently processed by RMA progress module 312 following a dequeue operation 326, a receive operation 328 is posted, using the remote address range provided by AM write request 316 as the receive buffer in RMA address space 314. This receive operation should match the send posted on the initiator side (that is, the local node 300) and place the send data 320 in the intended remote address range in RMA address space 314. In general, the sending of AM write requests and the corresponding data sends can be asynchronous, although an AM write request should preferably precede the sending of its associated data. In one embodiment, the receive operation 328 can temporarily buffer the send data 320 before it is written to RMA address space 314. As shown in Figure 3b, the local node 300 further comprises a destination buffer 330 and an RMA read module 332, which are configured to facilitate a tagged-message read operation as follows. As with the tagged-message write operation, an RMA read consists of a receive operation followed by an AM read request, and the AM read request handler causes a send operation to be issued on the target side. The target-side queue is necessary because message transmission operations normally cannot be issued from within an AM handler. As shown in Figure 3b, RMA read module 332 issues an AM read request 334 that is addressed to and received by an AM read request handler 336 on the remote node 302. AM read request handler 336 generates a corresponding RMA read request and places it in an RMA read request queue 338, while a send block 340 in RMA progress module 312 dequeues read requests from RMA read request queue 338, retrieves the corresponding data from RMA address space 314 and sends the data via a send operation 342 to a receive operation 344 in RMA read module 332.
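The target side of the tagged-message write path of Figure 3a can be sketched as follows. This is a simplified single-process Python model under stated assumptions: the `RemoteNode` class, the `unexpected` dictionary holding data that arrived before a matching receive, and the retry-on-next-progress policy are all illustrative choices, not details taken from the patent or from any tagged messaging library.

```python
from collections import deque


class RemoteNode:
    def __init__(self, size):
        self.memory = bytearray(size)  # RMA address space
        self.rma_queue = deque()       # filled by the AM handler, drained by progress
        self.unexpected = {}           # tag -> data that arrived before its receive

    def am_write_request_handler(self, header):
        # The handler only enqueues; it does not issue the receive itself.
        self.rma_queue.append(header)

    def tagged_send_arrives(self, tag, data):
        # Models arrival of the initiator's tagged send.
        self.unexpected[tag] = bytes(data)

    def rma_progress(self):
        """Dequeue requests and complete tag-matched receives; return how many finished."""
        done, pending = 0, deque()
        while self.rma_queue:
            req = self.rma_queue.popleft()
            addr, length = req["rma_addr"], req["length"]
            if addr < 0 or addr + length > len(self.memory):
                raise ValueError("request outside the RMA address space")
            data = self.unexpected.pop(req["tag"], None)
            if data is None:
                pending.append(req)  # data not here yet; retry on the next call
                continue
            if len(data) != length:
                raise ValueError("send length does not match the request")
            self.memory[addr:addr + length] = data  # receive buffer is the target range
            done += 1
        self.rma_queue = pending
        return done


node = RemoteNode(32)
node.am_write_request_handler({"rma_addr": 4, "length": 3, "tag": 7})
first = node.rma_progress()   # request queued, but the tagged data has not arrived
node.tagged_send_arrives(7, b"abc")
second = node.rma_progress()  # tag 7 now matches and the data lands at address 4
```

Keeping the handler limited to an enqueue mirrors the constraint stated above that message transmission operations normally cannot be issued from inside an AM handler; the progress call is where the receive is actually matched.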
In the read flow of Figure 3b, the receive operation 344 then writes the data to destination buffer 330, completing the remote read operation. Depending on the approach used, the memory address space or spaces used for the RMA write and read operations according to the above embodiments may or may not require advance registration. For example, some embodiments use PSM (Performance Scaled Messaging), which does not require memory registration. PSM defines an API designed specifically for HPC. It defines a tagged messaging API that handles high-level capabilities and can efficiently support the implementation of the Message Passing Interface (MPI) standard. Meanwhile, the internal implementation of PSM can deal with interconnect-specific details relating to data movement strategies, debugging functions and advanced features such as Quality of Service (QoS), dispersive routing, resilience, etc. Active message functionality is also provided on an experimental basis.
[0010] PSM is designed to be implemented as a user space library. The details for performing RMA data transfers using PSM are given in versions of the PSM Programming Manual published by QLogic, the developer of PSM, or in various PSM documents published by the OpenFabrics Alliance. PSM has been included in OFED (OpenFabrics Enterprise Distribution) since version 1.5.2, and is a counterpart of IB Verbs. Although PSM is intended for use with InfiniBand, in embodiments herein functionality similar to PSM may be used with non-InfiniBand hardware such as Ethernet network adapters. As noted above, PSM does not require memory registration. Optionally, a lightweight memory registration mechanism may be used to provide access validation. Implementation considerations may include the control structure of the memory region to be accessed, such as a contiguous address space or a list of separate address ranges. In one embodiment, the address of the control structure may be used as the access code.
Figure 4 shows a timeline of operations and messages executed by a local node 400 and a remote node 402 connected by a network link 404. Although it is depicted as a direct connection, it will be understood that the network link 404 may traverse one or more additional network elements such as a switch or the like. The timing diagram starts with the local node 400 registering one or more address ranges 406 in one or more address spaces in the local memory of the local node 400. The local node 400 then publishes access codes 408 via code publication messages 410, which are sent to one or more remote nodes such as the remote node 402. In one embodiment, the access code 408 is encoded to identify the address range(s) / address space(s) registered by the local node 400. Next, the access code 408 is used to validate RMA write and RMA read requests issued by the remote node 402 to access memory in the address range(s) / address space(s) registered by the local node 400. As shown in the lower part of Figure 4, a message 412 corresponding to an AM write request or an AM read request is sent by the remote node 402 to the local node 400. The message 412 includes the access code 408. On reception, the AM write or AM read is validated by the local node 400 using the access code 408. This validation operation also checks that the address range specified by the request's start address and its explicit or implied size falls within the registered address range(s) / address space(s). If the validation succeeds, access to the registered memory is allowed; otherwise, it is denied. In one embodiment, an access error message 416 is returned to the remote node 402 if its AM write or AM read request fails validation. Figure 5 is a diagram of an example of an apparatus configured to be implemented as a local node 500 that can be used to implement aspects of the embodiments set forth herein.
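The registration and validation steps of Figure 4 can be sketched in Python as below. The sketch assumes an opaque random token as the access code; the text instead suggests, for one embodiment, using the address of the region's control structure. The class and method names and the `"OK"` / `"ACCESS_ERROR"` return values are hypothetical.

```python
import secrets


class RegisteredMemory:
    def __init__(self, size):
        self.memory = bytearray(size)
        self.regions = {}  # access code -> list of (start, end) ranges, end exclusive

    def register(self, ranges):
        """Register (start, length) ranges and return the access code to publish."""
        code = secrets.token_hex(8)  # assumption: opaque random token as the code
        self.regions[code] = [(start, start + length) for start, length in ranges]
        return code

    def validate(self, code, addr, length):
        """True if [addr, addr + length) lies inside one range registered under code."""
        return any(start <= addr and addr + length <= end
                   for start, end in self.regions.get(code, []))

    def am_write(self, code, addr, data):
        if not self.validate(code, addr, len(data)):
            return "ACCESS_ERROR"  # stands in for access error message 416
        self.memory[addr:addr + len(data)] = data
        return "OK"


node = RegisteredMemory(64)
code = node.register([(16, 8)])                     # one 8-byte range at address 16
ok = node.am_write(code, 16, b"12345678")           # inside the registered range
overflow = node.am_write(code, 20, b"12345678")     # runs past the end of the range
forged = node.am_write("not-a-code", 16, b"x")      # unknown access code
```

Validating the whole range rather than only the start address is what prevents a request with a valid code from spilling past the end of the registered region.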
(It will be understood that whether a node is "local" or "remote" depends on the point of view of the node, and that the operations and functionality described herein can be implemented by a single node acting as a local or remote node in the context of RMA write or read operations.) In one embodiment, node 500 comprises a server blade or a server module configured to be installed in a server chassis. The server blade / module includes a motherboard 502 on which various components are mounted, including a processor 504, a memory 506, a storage unit 508, and a network interface 510. The motherboard 502 typically includes one or more connectors to receive power from the server chassis and to communicate with other components in the chassis. For example, a common blade server or module architecture uses a backplane or similar device with multiple connectors, into which mating connectors of the server blades or modules are installed. The processor 504 includes a CPU 512 having one or more cores. The CPU and / or cores are coupled to an interconnect 514 which is representative of one or more interconnects implemented in the processor (for simplicity, only one interconnect is shown). The interconnect 514 is also coupled to a memory interface (I / F) 516 and a PCIe (Peripheral Component Interconnect Express) interface 518. The memory interface 516 is coupled to the memory 506, while the PCIe interface 518 couples the processor 504 to various input / output (I / O) devices, including the storage unit 508 and the network interface 510. In general, the storage unit 508 is representative of one or more nonvolatile storage devices such as, without limitation, a magnetic or optical disk, a solid-state drive (SSD), a flash memory chip or module, etc. The network interface 510 is representative of various types of network interfaces that may be implemented in a server node, such as an Ethernet network adapter or NIC.
The network interface 510 comprises a PCIe interface 520, a Direct Memory Access (DMA) engine 522, a transmit buffer 524, a receive buffer 526, a MAC module 530 and a packet processing engine / NPU 532. The network interface 510 further comprises a PHY circuit 534 including the circuitry and logic for implementing an Ethernet physical layer. An optional reconciliation layer 536 is also shown. The PHY circuit 534 comprises a set of PHY sublayers 538a-d, a serializer / deserializer (SERDES) 540, a transmit port 542 with a transmit buffer 544 and one or more transmitters 546, and a receive port 548 with a receive buffer 550 and one or more receivers 552. The node 500 is further shown to be in communication with a remote node 554 having a receive port 556 and a transmit port 558 via a wired or optical link 560. Depending on the particular Ethernet PHY being implemented, different combinations of PHY sublayers may be used, as well as different transmitter and receiver configurations. For example, a GE PHY uses PHY circuitry different from that of a 40 GE or 100 GE PHY. Various software components are run on one or more cores of the CPU 512 to implement the software aspects of the embodiments described above with reference to Figures 1a, 1b, 2a, 2b, 3a and 3b. The exemplary software components shown in Figure 5 include a host operating system 562, applications 564, and software instructions for implementing various AM handlers 566 and RMA modules 568. All or a portion of the software components are typically stored on the node, for example in the storage unit 508. Furthermore, in some embodiments, one or more of the components may be downloaded via a network and stored in the memory 506 and / or the storage unit 508. During operation of the node 500, portions of the host operating system 562 are loaded into the memory 506, along with one or more applications 564 that are executed in the user space of the operating system (OS).
The AM handlers 566 and RMA modules 568 can generally be implemented as OS components or drivers, or as software components running in the OS user space. In some embodiments, all or part of the AM handlers 566 and / or RMA modules 568 may be implemented as embedded software that runs on one or more processing elements located in the network interface 510, such as the packet processing engine / NPU 532. As another option, all or part of the operations of the AM handlers and / or RMA modules can be implemented via one or more virtual machines hosted on node 500 (not shown). In the embodiment illustrated in Figure 5, the MAC module 530 is shown as part of the network interface 510, which comprises a hardware component. The logic for implementing the various operations supported by the network interface 510 may be implemented via embedded logic and / or embedded software running on the packet processing engine / NPU 532 or one or more other processing elements. For example, embedded logic can be used to prepare higher-layer packets for outbound transfer from transmit port 542. This includes encapsulating higher-layer packets (e.g., TCP / IP, UDP, other protocols, etc.) in Ethernet packets and then framing the Ethernet packets, the Ethernet packets being used to generate a stream of Ethernet frames. In general, the packet processing operations performed by the packet processing engine / NPU 532 may be implemented by embedded logic and / or embedded software. Packet processing is implemented to manage the forwarding of data within the network interface 510 and between the network interface 510 and the memory 506. This includes the use of the DMA engine 522, which is configured to move data from the receive buffer 526 to the memory 506 using DMA write operations, so that the data is sent via the PCIe interfaces 520 and 518 to the memory 506 in a manner that does not involve the CPU 512.
In some embodiments, the transmit buffer 524 and the receive buffer 526 include an I / O address space "mapped" into memory (Memory-Mapped IO - MMIO) which is configured to facilitate DMA data transfers between these buffers and the memory 506 using techniques well known in the networking art. [0011] None, all or part of the MAC layer operations can be implemented in software running on the host processor 504. In an embodiment using a split MAC architecture, the encapsulation and decapping operations of the Ethernet packets are performed. in software, while Ethernet dithering and dithering are implemented by the hardware (eg, via embedded logic or embedded software running on the packet processing engine / NPU 532 on the device). network interface 510). [0012] 3025331 16 RMA write operations and RIVIA systems using AM requests, AM responses, and associated AM managers offer advantages over existing RA techniques. For example, because this system can be implemented via software running on a host, RMA data transfers that previously required specially configured hardware (eg InfiniBand HCAs) can be expanded for use with other protocols such as Ethernet, but without limitation. These systems can be combined with existing techniques, for example using PSM or tagged messaging APIs to further increase their functionality and performance. [0013] Other aspects of the object described herein are presented in the following numbered clauses: 1. 
A method for performing remote memory access (RMA) data transfers between a remote node and a local node, the method comprising: performing an RMA write in which data is written from the local node to the remote node by reading the data to be written to a source buffer on the local node; by sending a first active message (Active Message AM) request from the local node to an AM manager on the remote node, this first AM write request containing the data to be written and a start address in a space d remote memory addresses on the remote node in which the data is to be written; and processing the first AM write request with the AM manager on the remote node by retrieving the data from the first AM write request and writing that data into a range of addresses in the address space of the AM. remote memory starting at the start address. A method according to clause 1, further comprising: sending an AM write response from the remote node to the local node, which AM write response indicates that the data has been successfully written to the space 30 addresses of the remote memory; and using an AM manager on the local node to process the AM write response message. 
3. A method according to clause 1 or 2, further comprising: dividing the data to be written into a plurality of packets; and, for each of the plurality of packets: reading the packet data corresponding to that packet from the source buffer on the local node; sending a corresponding AM write request from the local node to the AM manager on the remote node, containing the packet data and a start address in the remote memory address space on the remote node at which the packet data is to be written; and processing the corresponding AM write request with the AM manager on the remote node by extracting the packet data from the corresponding AM write request and writing this data to a range of addresses in the remote memory address space beginning at the start address, the first AM write request corresponding to the AM write request used to transfer a first data packet of the plurality of packets.
4. A method according to clause 3, further comprising: detecting that all packet data has been successfully written to the remote memory address space; sending an AM response from the remote node to the local node indicating that the packet data has been successfully written to the remote memory address space; and using an AM manager on the local node to process the AM response.
5. 
A method according to any one of the preceding clauses, further comprising: performing an RMA read operation in which data is read from the remote node and transferred to the local node by sending an AM read request to the AM manager on the remote node, the AM read request identifying a range of addresses in the remote address space containing the data to be read; retrieving, in response to receiving the AM read request, the data to be read from the remote address space via the AM manager on the remote node and sending the retrieved data to the local node via an AM read response; and processing the AM read response with an AM manager on the local node by extracting the data from the AM read response and writing that data to a destination buffer on the local node.
6. The method of any one of the preceding clauses, further comprising: performing an RMA read operation in which data is read from the remote node and transferred to the local node using a plurality of packets, the data for each of the plurality of packets being transferred by sending a corresponding AM read request to the AM manager on the remote node, the AM read request identifying a range of addresses in the remote address space containing the packet data to be read; retrieving, in response to receiving the AM read request, the packet data to be read from the remote address space via the AM manager on the remote node and sending the retrieved packet data to the local node via an AM read response; and processing the AM read response with an AM manager on the local node by extracting the packet data from the AM read response and writing the packet data to a destination buffer on the local node.
7. The method of any one of the preceding clauses, further comprising: using a tagged message for the first AM write request under a tagged messaging scheme; and using the AM manager on the remote node to inspect the tagged message to verify that the remote node is the intended recipient of the first AM write request.
8. A method according to any one of the preceding clauses, further comprising: registering, on the local node, at least one address range in the remote memory address space on the remote node to which data may be written using one or more AM write requests sent by the remote node.
9. A method according to clause 8, further comprising: publishing an access code to the remote node corresponding to at least one address range registered with the local node; including the access code in the first AM write request; and inspecting the access code with the AM manager on the remote node to validate that the first AM write request is allowed.
10. A machine-readable non-transitory medium having first and second sets of instructions stored thereon and configured to be executed respectively on a local node and a remote node to implement the method according to any one of the preceding clauses.
11. A method for performing remote memory access (RMA) data transfers between a remote node and a local node, the method comprising: performing an RMA write in which data is written from the local node to the remote node by sending a first AM write request from the local node to an AM manager on the remote node, this first AM write request identifying a range of addresses in a remote memory address space on the remote node to which the data is to be written; reading the data to be written from a source buffer on the local node and sending the data to the remote node; and processing the data received by the remote node so that the data is written to the remote memory address space at the address range identified in the first AM write request.
12. A method according to clause 11, further comprising: receiving a plurality of AM write requests, each AM write request identifying a range of addresses in the remote memory address space on the remote node to which a data block associated with the AM write request, to be sent subsequently, is to be written, each AM write request comprising indicia identifying the associated data block; queuing, using the AM manager on the remote node, each AM write request in an RMA write queue on the remote node; receiving a plurality of data blocks from the local node, each data block being associated with a previously received AM write request and containing indicia by which the previously received AM write request can be identified; removing the AM write requests from the RMA write queue; and processing each AM write request removed from the queue to determine whether the received data block associated with that AM write request is to be written to the remote memory address space.
13. A method according to clause 11 or 12, further comprising performing an RMA read in which data is read from the remote node and transferred to the local node by: sending an AM read request to the AM manager on the remote node, the AM read request identifying a range of addresses in the remote address space containing the data to be read; generating, in response to receiving the AM read request, via the AM manager on the remote node, an RMA read request corresponding to the AM read request and identifying the address range in the remote address space of the data to be read; processing the RMA read request on the remote node so that the data defined by the address range in the RMA read request is retrieved from the remote address space and sent to the local node; and writing the retrieved data sent from the remote node to a destination buffer on the local node.
14. 
A method according to clause 13, further comprising: in response to receiving the AM read request, generating and queuing, via the AM manager on the remote node, an RMA read request corresponding to the AM read request in an RMA read request queue on the remote node, the RMA read request identifying the address range in the remote address space of the data to be read; removing the RMA read request from the RMA read request queue; and retrieving the data to be read from the remote address space as defined by the address range in the RMA read request and sending the retrieved data to the local node.
15. The method according to clause 14, wherein the dequeuing, data retrieval and sending operations are performed by an RMA read function on the remote node that is separate from the AM manager on the remote node.
16. A method according to any one of clauses 11 to 15, further comprising registering, on the local node, at least one address range in the remote memory address space on the remote node to which data can be written using one or more AM write requests sent from the remote node.
17. The method of clause 16, further comprising: sending an access code to the remote node corresponding to at least one address range registered with the local node; including the access code in the first AM write request; and inspecting the access code using the AM manager on the remote node to validate that the first AM write request is allowed.
18. A machine-readable non-transitory medium having first and second sets of instructions stored thereon and configured to be executed respectively on a local node and a remote node to implement the method according to any one of clauses 11 to 17.
19. An apparatus comprising: a network interface; memory, including a local memory address space; and an Active Message (AM) write request handler, configured to: receive, via the network interface, a first AM write request sent by a remote apparatus over a communication link coupled to the network interface, the first AM write request corresponding to a remote memory access (RMA) write issued by the remote apparatus and containing first data to be written and a start address in the local memory address space in the memory at which the data is to be written; process the first AM write request by extracting the data from the first AM write request and writing the data to a range of addresses in the remote memory address space starting at the start address; and send an AM write response to the remote apparatus, indicating that the data has been successfully written to the remote memory address space.
20. Apparatus according to clause 19, further comprising: an RMA write module configured to perform an AM write operation in which a second AM write request, corresponding to an RMA write to a remote memory address space in the memory of the remote apparatus, is generated and sent to the remote apparatus, the second AM write request containing second data to be written and a start address in the remote memory address space at which the data is to be written; and an AM write response manager configured to process an AM write response sent by the remote apparatus after the second data has been successfully written to the remote memory address space.
21. Apparatus according to clause 20, further comprising a source buffer, wherein the RMA write module is further configured to: divide third data to be written to the remote memory address space into a plurality of packets; for each of the plurality of packets, read the packet data corresponding to that packet from the source buffer; and send a corresponding AM write request to an AM write request handler on the remote apparatus, containing the packet data and a corresponding start address in the remote memory address space on the remote apparatus at which the packet data is to be written.
22. Apparatus according to any one of clauses 19 to 21, wherein the AM write request handler is further configured to: receive a plurality of AM write requests from the remote apparatus, each of the plurality of AM write requests containing packet data corresponding to a remote write of third data that is divided into a plurality of packets and is to be written to the local memory address space at a corresponding start address; extract the packet data from each of the plurality of AM write requests and write the packet data to the local memory address space starting at the start address identified by that AM write request; detect that all of the third data has been written to the local memory address space; and send an AM write response to the remote apparatus, indicating that the third data has been successfully written to the local memory address space.
23. Apparatus according to any one of clauses 19 to 22, further comprising: a destination buffer in the local memory; an AM read response handler; and an RMA read module configured to: generate and send an AM read request to an AM read request handler on the remote apparatus, the AM read request identifying a range of addresses in the remote address space of the remote apparatus containing the data to be read; receive from the remote apparatus, in response to the AM read request, an AM read response containing the data read from the remote address space; and process the AM read response with the AM read response handler by extracting the data from the AM read response and writing the data to the destination buffer.
24. Apparatus according to any one of clauses 19 to 23, wherein the RMA read module is further configured to perform an RMA read in which data is read from the remote apparatus and transferred using a plurality of packets, the data for each of the plurality of packets being read remotely by: sending a corresponding AM read request to the AM read request handler on the remote apparatus, the AM read request identifying a range of addresses in the remote address space containing the packet data to be read; receiving from the remote apparatus, in response to each AM read request, an AM read response containing the packet data read from the remote address space; and processing the AM read response with the AM read response handler by extracting the packet data from the AM read response and writing the packet data to the destination buffer.
25. Apparatus according to any one of clauses 19 to 24, further comprising: an AM read request handler configured to: receive an AM read request from the remote apparatus identifying a range of addresses in the local address space containing the data to be read; retrieve the data in the address range identified in the AM read request; and send an AM response containing the retrieved data to the remote apparatus.
26. Apparatus comprising: a network interface; memory, including a local memory address space; an Active Message (AM) write request handler, configured to: receive, via the network interface, a plurality of AM write requests sent by a remote apparatus over a communication link coupled to the network interface, each AM write request corresponding to a remote memory access (RMA) write issued by the remote apparatus and containing data to be written to the local memory address space and a start address at which the start of the corresponding data is to be written; and generate an RMA write request and place the RMA write request in an RMA write queue; and an RMA progress module configured to: receive a plurality of data blocks from the remote apparatus, each data block being associated with a previously received AM write request and containing indicia by which the previously received AM write request can be identified; remove the RMA write requests from the RMA write queue; and process each RMA write request removed from the queue to determine whether the received data block associated with the RMA write request is to be written to the local memory address space.
27. Apparatus according to clause 26, further comprising: a source buffer in the memory; and an RMA write module configured to: send an AM write request to the remote apparatus, the AM write request identifying a range of addresses in a remote memory address space in the memory of the remote apparatus to which the data is to be written; read the data to be written from the source buffer; and send the data to the remote apparatus, the data being sent after the AM write request is sent, and the data being sent with indicia configured to be used to match the data that is sent with the AM write request.
28. Apparatus according to clause 26 or 27, further comprising: an RMA read request queue; an AM read request handler configured to: receive an AM read request from the remote apparatus identifying a range of addresses in the local address space containing the data to be read; generate, in response to receiving the AM read request, a corresponding RMA read request identifying the address range in the local address space containing the data to be read; and place the RMA read request in the RMA read request queue; and an RMA progress module configured to: remove the RMA read request from the RMA read request queue; read the data from the local address space as defined by the address range in the RMA read request; and send the data that is read to the remote apparatus.
29. Apparatus according to clause 28, further comprising: a destination buffer occupying a portion of the memory; and an RMA read module configured to: send an AM read request to the remote apparatus, the AM read request identifying a range of addresses to be read in a remote address space located in the memory of the remote apparatus; receive from the remote apparatus data that has been read from the remote address space; and write the received data to the destination buffer.
30. A machine-readable non-transitory medium having instructions stored thereon and configured to be executed on a node having a network interface and a memory having a local address space, said instructions including: an Active Message (AM) write request handler configured to, when enabled: receive, via the network interface, a first AM write request sent by a remote apparatus over a communication link coupled to the network interface, the first AM write request corresponding to a remote memory access (RMA) write issued by the remote apparatus and containing first data to be written and a start address in the local memory address space in the memory at which the data is to be written; process the first AM write request by extracting the data from the first AM write request and writing the data to a range of addresses in the remote memory address space starting at the start address; and send an AM write response to the remote apparatus, indicating that the data has been successfully written to the remote memory address space.
31. A machine-readable non-transitory medium according to clause 30, further containing instructions comprising: an RMA write module configured to perform an AM write operation in which a second AM write request, corresponding to an RMA write to a remote memory address space located in the memory of the remote apparatus, is generated and sent to the remote apparatus, this second AM write request containing second data to be written and a start address in the remote memory address space at which the data is to be written; and an AM write response manager configured to process an AM write response sent by the remote apparatus after the second data has been successfully written to the remote memory address space.
32. 
A machine-readable non-transitory medium according to clause 31, the node further comprising a source buffer, and the RMA write module being further configured to: divide third data to be written to the remote memory address space into a plurality of packets; for each of the plurality of packets, read the packet data corresponding to that packet from the source buffer; and send a corresponding AM write request to an AM write request handler on the remote apparatus, containing the packet data and a corresponding start address in the remote memory address space on the remote apparatus at which the packet data is to be written.
33. A machine-readable non-transitory medium according to any one of clauses 30 to 32, the AM write request handler being further configured to: receive a plurality of AM write requests from the remote apparatus, each of the plurality of AM write requests containing packet data corresponding to a remote write of third data that is divided into a plurality of packets and is to be written at a corresponding start address in the local memory address space; extract the packet data from each of the plurality of AM write requests and write the packet data to the local memory address space starting at the start address identified by that AM write request; detect that all of the third data has been written to the local memory address space; and send an AM write response to the remote apparatus, indicating that the third data has been successfully written to the local memory address space.
34. 
A machine-readable non-transitory medium according to any one of clauses 30 to 33, the node further comprising a destination buffer in the local memory, and the instructions further comprising: an AM read response manager; and an RMA read module configured to, when enabled: generate and send an AM read request to an AM read request handler on the remote apparatus, the AM read request identifying a range of addresses in the remote address space of the remote apparatus containing the data to be read; receive from the remote apparatus, in response to the AM read request, an AM read response containing the data read from the remote address space; and process the AM read response with the AM read response manager by extracting the data from the AM read response and writing the data to the destination buffer.
35. A machine-readable non-transitory medium according to any one of clauses 30 to 34, the RMA read module being further configured to perform an RMA read operation in which data is read from the remote apparatus and transferred using a plurality of packets, the data for each of the plurality of packets being read remotely by: sending a corresponding AM read request to the AM read request handler on the remote apparatus, the AM read request identifying a range of addresses in the remote address space containing the packet data to be read; receiving from the remote apparatus, in response to each AM read request, an AM read response containing the packet data read from the remote address space; and processing the AM read response with the AM read response manager by extracting the packet data from the AM read response and writing the packet data to the destination buffer.
36. 
A machine-readable non-transitory medium according to any one of clauses 30 to 35, the instructions further comprising: an AM read request handler configured to: receive an AM read request from the remote apparatus identifying a range of addresses in the local address space containing the data to be read; retrieve the data in the address range identified in the AM read request; and send an AM response containing the retrieved data to the remote apparatus.
37. A machine-readable non-transitory medium having instructions stored thereon and configured to be executed on a node having a network interface and a memory having a local address space, said instructions comprising: an Active Message (AM) write request handler configured to, when enabled: receive, via the network interface, a plurality of AM write requests sent by a remote apparatus over a communication link coupled to the network interface, each AM write request corresponding to a remote memory access (RMA) write issued by the remote apparatus and containing corresponding data to be written to the local memory address space and a start address at which the beginning of the data is to be written; and generate an RMA write request and place the RMA write request in an RMA write request queue; and an RMA progress module configured to, when enabled: receive a plurality of data blocks from the remote apparatus, each data block being associated with a previously received AM write request and containing indicia from which an RMA write request can be identified; remove the RMA write requests from the RMA write request queue; and process each RMA write request removed from the queue to determine whether the received data block associated with the RMA write request is to be written to the local memory address space.
38. 
A machine-readable non-transitory medium according to clause 37, the node further comprising a source buffer in the memory, and the instructions further comprising: an RMA write module configured to, when enabled: send an AM write request to the remote apparatus, the AM write request identifying a range of addresses in the remote memory address space located in the memory of the remote apparatus to which the data is to be written; read the data to be written from the source buffer; and send the data to the remote apparatus, the data being sent after the AM write request is sent, and the data being sent with indicia configured to be used to match the data that is sent with the AM write request.
39. A machine-readable non-transitory medium according to clause 37 or 38, the node further comprising an RMA read request queue, and the instructions further comprising: an AM read request handler configured to, when enabled: receive an AM read request from the remote apparatus identifying a range of addresses in the local address space containing data to be read; generate, in response to receiving the AM read request, a corresponding RMA read request identifying the range of addresses in the local address space containing the data to be read; and place the RMA read request in the RMA read request queue; and an RMA progress module configured to: remove the RMA read request from the RMA read request queue; read data from the local address space as defined by the address range in the RMA read request; and send the data that is read to the remote apparatus.
40. 
A machine-readable non-transitory medium according to clause 39, wherein the node further comprises a destination buffer occupying a portion of the memory, and the instructions further comprise: an RMA read module configured to, when enabled: send an AM read request to the remote apparatus, the AM read request identifying a range of addresses to be read in the remote address space located in the memory of the remote apparatus; receive from the remote apparatus data that has been read from the remote address space; and write the received data to the destination buffer.
Although certain embodiments have been described with reference to particular implementations, other implementations are possible according to some embodiments. In addition, the arrangement and/or order of the elements or other features illustrated in the drawings and/or described herein need not be arranged in the particular manner illustrated and described. Numerous other arrangements are possible according to some embodiments. [0014] In each system shown in a figure, the elements may, in some cases, each have the same reference number or a different reference number to suggest that the elements represented could be different and/or similar. However, an element may be flexible enough to have different implementations and to work with some or all of the systems shown or described herein. [0015] The various elements shown in the figures may be the same or different. Which one is referred to as a first element and which as a second element is arbitrary. In the description and in the claims, the terms "coupled" and "connected", along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, the term "connected" may be used to indicate that two or more elements are in direct physical or electrical contact with each other. 
The term "coupled" may mean that two or more elements are in direct physical or electrical contact. However, "coupled" may also mean that two or more elements are not in direct contact with each other, but still cooperate or interact with each other. An algorithm is here, and generally, considered to be a self-consistent sequence of acts or operations leading to a desired result. These include physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be understood, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. An embodiment is an implementation or an example of the invention. In the description, references to "an embodiment", "certain embodiments" or "other embodiments" mean that a particular feature, structure or characteristic described in connection with the embodiments is included in at least some embodiments, but not necessarily all embodiments, of the invention. The various occurrences of "an embodiment" or "certain embodiments" do not necessarily all refer to the same embodiments. Not all components, features, structures, characteristics, etc. described and illustrated herein need be included in a particular embodiment or embodiments. If the description states that a component, feature, structure or characteristic "may" or "could" be included, for example, that particular component, feature, structure or characteristic is not required to be included. If the description or the claims refer to "an" element, that does not mean that there is only one of the element. 
If the description or claims refer to "an additional element", that does not preclude there being more than one of the additional element. As noted above, various aspects of the embodiments presented herein may be facilitated by corresponding software and/or firmware components and applications, such as software running on a server or processor, or software and/or firmware executed by an embedded processor or the like. Thus, embodiments of this invention may be used as or to support a software program, software modules, firmware, and/or distributed software executed on some form of processing core (such as the CPU of a computer or one or more cores of a multicore processor), a virtual machine running on a processor or core, or otherwise implemented or realized on or within a computer-readable or machine-readable non-transitory storage medium. A computer-readable or machine-readable non-transitory storage medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a computer-readable or machine-readable non-transitory storage medium includes any mechanism that provides (i.e., stores and/or transmits) information in a form accessible by a computer or computing machine (e.g., computing device, electronic system, etc.), such as recordable/non-recordable media (e.g., read-only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, etc.). The content may be directly executable ("object" or "executable" form), source code, or difference code ("delta" or "patch" code). A computer-readable or machine-readable non-transitory storage medium may also include a storage device or a database from which content can be downloaded. The computer-readable or machine-readable non-transitory storage medium may also include a device or product having content stored thereon at the time of sale or delivery. 
Thus, delivering a device with stored content, or offering content for download over a communication medium, may be understood as providing an article of manufacture comprising a computer-readable or machine-readable non-transitory storage medium with such content as described herein. [0016] Various components referred to above as processes, servers or tools and described herein may be means for performing the described functions. The operations and functions performed by the various components described herein may be implemented by software running on a processing element, via embedded hardware or the like, or any combination of hardware and software. These components may be implemented as software modules, hardware modules, special-purpose hardware (e.g., application-specific hardware, ASICs, DSPs, etc.), embedded controllers, hardwired circuitry, hardware logic, etc. The software content (e.g., data, instructions, configuration information, etc.) may be provided via an article of manufacture, such as a computer-readable or machine-readable non-transitory storage medium, that provides content representing instructions that can be executed. The content may cause a computer to perform the various functions/operations described herein. In this document, a list of items joined by the expression "at least one of" can mean any combination of the listed items. For example, the expression "at least one of A, B and C" may denote A; B; C; A and B; A and C; B and C; or A, B and C. The foregoing description of illustrated embodiments of the invention, including what is described in the abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Although specific embodiments and examples of the invention are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the art will appreciate. 
These modifications can be made to the invention in light of the detailed description above. The terms used in the following claims should not be construed as limiting the invention to the specific embodiments set forth in the description and drawings. On the contrary, the scope of the invention is to be determined in its entirety by the following claims, which must be considered to be in accordance with the established doctrines of the interpretation of the claims.
Claims:
1. A method for performing remote memory access (RMA) data transfers between a remote node and a local node, the method comprising: performing an RMA write in which data is written from the local node to the remote node by: reading the data to be written from a source buffer on the local node; sending a first Active Message (AM) write request from the local node to an AM handler on the remote node, the first AM write request containing the data to be written and a start address in a remote memory address space on the remote node at which the data is to be written; and processing the first AM write request with the AM handler on the remote node by extracting the data from the first AM write request and writing the data into a range of addresses in the remote memory address space beginning at the start address.

2. The method of claim 1, further comprising: sending an AM write response from the remote node to the local node, the AM write response indicating that the data has been successfully written to the remote memory address space; and using an AM handler on the local node to process the AM write response.
3. The method of claim 1, further comprising: dividing the data to be written into a plurality of packets; and, for each of the plurality of packets: reading the packet data corresponding to that packet from the source buffer on the local node; sending a corresponding AM write request from the local node to the AM handler on the remote node, the request containing the packet data and a start address in the remote memory address space on the remote node at which the packet data is to be written; and processing the corresponding AM write request with the AM handler on the remote node by extracting the packet data from the corresponding AM write request and writing the packet data into a range of addresses in the remote memory address space beginning at the start address, wherein the first AM write request corresponds to the AM write request used to transfer the packet data of a first packet of the plurality of packets.

4. The method of claim 3, further comprising: detecting that all of the packet data has been successfully written to the remote memory address space; sending an AM response from the remote node to the local node indicating that the packet data has been successfully written to the remote memory address space; and using an AM handler on the local node to process the AM response.
5. The method of claim 1, further comprising: performing an RMA read in which data is read from the remote node and transferred to the local node by: sending an AM read request to the AM handler on the remote node, the AM read request identifying a range of addresses in the remote address space of the data to be read; retrieving, in response to receiving the AM read request, the data to be read from the remote address space via the AM handler on the remote node and sending the retrieved data to the local node in an AM read response; and processing the AM read response with an AM handler on the local node by extracting the data from the AM read response and writing the data to a destination buffer on the local node.

6. The method of claim 1, further comprising: performing an RMA read in which data is read from the remote node and transferred to the local node using a plurality of packets, the data for each of the plurality of packets being transferred by: sending a corresponding AM read request to the AM handler on the remote node, the AM read request identifying a range of addresses in the remote address space of the packet data to be read; retrieving, in response to receiving the AM read request, the packet data to be read from the remote address space via the AM handler on the remote node and sending the retrieved packet data to the local node in an AM read response; and processing the AM read response with an AM handler on the local node by extracting the packet data from the AM read response and writing the packet data to a destination buffer on the local node.

7. The method of claim 1, further comprising: using a tagged (marked) message for the first AM write request by means of a tagged message system; and using the AM handler on the remote node to inspect the tagged message to verify that the remote node is the intended recipient of the first AM write request.
8. The method of claim 1, further comprising: registering, with the local node, at least one address range in the remote memory address space on the remote node to which data can be written using one or more AM write requests sent by the remote node; publishing, from the remote node, an access key corresponding to at least one address range registered with the local node; including the access key in the first AM write request; and inspecting the access key with the AM handler on the remote node to validate that the first AM write request is permitted.

9. A method for performing remote memory access (RMA) data transfers between a remote node and a local node, the method comprising: performing an RMA write in which data is written from the local node to the remote node by: sending a first AM write request from the local node to an AM handler on the remote node, the first AM write request identifying a range of addresses in a remote memory address space on the remote node to which the data is to be written; reading the data to be written from a source buffer on the local node and sending the data to the remote node; and processing the data received by the remote node so that the data is written to the remote memory address space and occupies the address range identified in the first AM write request.
10. The method of claim 9, further comprising: receiving a plurality of AM write requests, each AM write request identifying a range of addresses in the remote memory address space on the remote node to which a data block associated with the AM write request, to be sent subsequently, is to be written, each AM write request including indicia identifying the associated data block; queuing, using the AM handler on the remote node, each AM write request in an RMA write queue on the remote node; receiving a plurality of data blocks from the local node, each data block being associated with a previously received AM write request and containing indicia by which the previously received AM write request can be identified; removing AM write requests from the RMA write queue; and processing each AM write request removed from the queue to determine whether the received data block associated with the AM write request is to be written to the remote memory address space.

11. The method of claim 9, further comprising: performing an RMA read in which data is read from the remote node and transferred to the local node by: sending an AM read request to the AM handler on the remote node, the AM read request identifying a range of addresses in the remote address space of the data to be read; generating, in response to receiving the AM read request, via the AM handler on the remote node, an RMA read request corresponding to the AM read request and identifying the range of addresses in the remote address space of the data to be read; processing the RMA read request on the remote node so that the data in the remote address space defined by the address range in the RMA read request is retrieved from the remote address space and sent to the local node; and writing the retrieved data sent from the remote node to a destination buffer on the local node.
12. The method of claim 11, further comprising: in response to receiving the AM read request, generating and queuing, via the AM handler on the remote node, an RMA read request corresponding to the AM read request in an RMA read request queue on the remote node, the RMA read request identifying the address range in the remote address space of the data to be read; removing the RMA read request from the RMA read request queue; and retrieving the data to be read from the remote address space as defined by the address range in the RMA read request and sending the retrieved data to the local node.

13. The method of claim 12, wherein the queue removal, data retrieval, and sending operations are performed by an RMA read function on the remote node that is separate from the AM handler on the remote node.

14. The method of claim 9, further comprising registering, with the local node, at least one address range in the remote memory address space on the remote node to which data can be written using one or more AM write requests sent from the remote node.

15. An apparatus comprising: a network interface; memory, including a local memory address space; and an Active Message (AM) write request handler configured to: receive, via the network interface, a first AM write request sent by a remote apparatus over a communication link coupled to the network interface, the first AM write request corresponding to a remote memory access (RMA) write issued by the remote apparatus and containing first data to be written and a start address in the local memory address space at which the data is to be written; process the first AM write request by extracting the data from the first AM write request and writing the data into a range of addresses in the local memory address space beginning at the start address; and send an AM write response to the remote apparatus indicating that the data has been successfully written to the local memory address space.
16. The apparatus of claim 15, further comprising: an RMA write module configured to perform an AM write operation in which a second AM write request, corresponding to an RMA write to a remote memory address space in the memory of the remote apparatus, is generated and sent to the remote apparatus, the second AM write request containing second data to be written and a start address in the remote memory address space at which the data is to be written; and an AM write response handler configured to process an AM write response sent by the remote apparatus after the second data has been successfully written to the remote memory address space.

17. The apparatus of claim 16, further comprising a source buffer, wherein the RMA write module is further configured to: divide third data to be written to the remote memory address space into a plurality of packets; and, for each of the plurality of packets, read the packet data corresponding to the packet from the source buffer and send a corresponding AM write request to an AM write request handler on the remote apparatus, the request containing the packet data and a corresponding start address in the remote memory address space on the remote apparatus at which the packet data is to be written.

18. The apparatus of claim 15, wherein the AM write request handler is further configured to: receive a plurality of AM write requests from the remote apparatus, each of the plurality of AM write requests containing packet data corresponding to a remote write of third data that is divided into a plurality of packets and is to be written to the local memory address space, together with a corresponding start address in the local memory address space; extract the packet data from each of the plurality of AM write requests and write the packet data to the local memory address space beginning at the start address identified by that AM write request; detect that all of the third data has been written to the local memory address space; and send an AM write response to the remote apparatus indicating that the third data has been successfully written to the local memory address space.

19. The apparatus of claim 15, further comprising: a destination buffer in the local memory; an AM read response handler; and an RMA read module configured to: generate and send an AM read request to an AM read request handler on the remote apparatus, the AM read request identifying a range of addresses in the remote address space of the remote apparatus where the data to be read is located; receive from the remote apparatus, in response to the AM read request, an AM read response containing the data read from the remote address space; and handle the AM read response with the AM read response handler by extracting the data from the AM read response and writing the data to the destination buffer.
20. The apparatus of claim 19, wherein the RMA read module is further configured to perform an RMA read in which data is read from the remote apparatus and transferred using a plurality of packets, the data for each of the plurality of packets being read remotely by: sending a corresponding AM read request to the AM read request handler on the remote apparatus, the AM read request identifying a range of addresses in the remote address space from which the packet data is to be read; receiving from the remote apparatus, in response to each AM read request, an AM read response containing the packet data read from the remote address space; and processing the AM read response with the AM read response handler by extracting the packet data from the AM read response and writing the packet data to the destination buffer.

21. The apparatus of claim 15, further comprising an AM read request handler configured to: receive an AM read request from the remote apparatus identifying a range of addresses in the local address space containing the data to be read; retrieve the data in the address range identified in the AM read request; and send an AM response containing the retrieved data to the remote apparatus.
22. An apparatus comprising: a network interface; memory, including a local memory address space; an Active Message (AM) write request handler configured to: receive, via the network interface, a plurality of AM write requests sent by a remote apparatus over a communication link coupled to the network interface, each AM write request corresponding to a remote memory access (RMA) write issued by the remote apparatus and identifying data to be written to the local memory address space and a start address at which a beginning of the corresponding data is to be written; and generate an RMA write request and enqueue the RMA write request in an RMA write queue; and an RMA progress module configured to: receive a plurality of data blocks from the remote apparatus, each data block being associated with a previously received AM write request and containing indicia by which the previously received AM write request can be identified; remove the RMA write requests from the RMA write queue; and process each RMA write request removed from the queue to determine whether the received data block associated with the RMA write request is to be written to the local memory address space.

23. The apparatus of claim 22, further comprising: a source buffer in the memory; and an RMA write module configured to: send an AM write request to the remote apparatus, the AM write request identifying an address range in a remote memory address space in the memory of the remote apparatus to which data is to be written; read the data to be written from the source buffer; and send the data to the remote apparatus, the data being sent after the AM write request is sent and accompanied by indicia configured to be used to match the data that is sent with the AM write request.

24. The apparatus of claim 22, further comprising: an RMA read request queue; an AM read request handler configured to: receive an AM read request from the remote apparatus identifying a range of addresses in the local address space containing the data to be read; generate, in response to receiving the AM read request, a corresponding RMA read request identifying the range of addresses in the local address space containing the data to be read; and place the RMA read request in the RMA read request queue; and an RMA progress module configured to: remove the RMA read request from the RMA read request queue; read the data from the local address space as defined by the address range in the RMA read request; and send the data that is read to the remote apparatus.

25. The apparatus of claim 24, further comprising: a destination buffer occupying a portion of the memory; and an RMA read module configured to: send an AM read request to the remote apparatus, the AM read request identifying a range of addresses to be read in a remote address space located in the memory of the remote apparatus; receive data from the remote apparatus that has been read from the remote address space; and write the received data to the destination buffer.
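The write path recited in claims 1 to 4, together with the access-key check of claim 8, can be sketched as follows. This is a minimal single-process Python illustration under stated assumptions, not the claimed implementation: the names `AMWriteRequest`, `RemoteNode`, `am_write_handler`, and `rma_write`, the `transfer_id` and `total_len` fields, and the 4-byte default packet size are all hypothetical, and the network is replaced by a direct function call.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AMWriteRequest:
    """Hypothetical AM write request: packet data plus its remote start address."""
    transfer_id: int   # assumed field: identifies the RMA write the packet belongs to
    start_addr: int    # start address in the remote memory address space
    data: bytes        # packet data carried inside the request
    total_len: int     # assumed field: total size of the whole RMA write
    access_key: int    # key published by the remote node for the registered range


@dataclass
class RemoteNode:
    memory: bytearray
    access_key: int
    registered: range                              # address range open to RMA writes
    received: dict = field(default_factory=dict)   # transfer_id -> bytes written so far
    responses: deque = field(default_factory=deque)  # stands in for AM write responses

    def am_write_handler(self, req: AMWriteRequest) -> None:
        """Validate the request, write its data, and respond once all packets landed."""
        end = req.start_addr + len(req.data)
        if req.access_key != self.access_key:
            raise PermissionError("access key mismatch")
        if not (self.registered.start <= req.start_addr and end <= self.registered.stop):
            raise ValueError("write falls outside the registered address range")
        # Write the packet data beginning at the start address named in the request.
        self.memory[req.start_addr:end] = req.data
        done = self.received.get(req.transfer_id, 0) + len(req.data)
        self.received[req.transfer_id] = done
        if done == req.total_len:
            # All packet data has been written: queue a single write response.
            del self.received[req.transfer_id]
            self.responses.append(("write_ok", req.transfer_id))


def rma_write(remote, src, remote_addr, access_key, transfer_id, packet_size=4):
    """Split the source buffer into packets and send one AM write request per packet."""
    for offset in range(0, len(src), packet_size):
        chunk = bytes(src[offset:offset + packet_size])
        request = AMWriteRequest(transfer_id, remote_addr + offset, chunk,
                                 len(src), access_key)
        remote.am_write_handler(request)  # direct call stands in for the network


node = RemoteNode(memory=bytearray(16), access_key=0xBEEF, registered=range(4, 16))
rma_write(node, b"helloworld", remote_addr=4, access_key=0xBEEF, transfer_id=1)
print(bytes(node.memory[4:14]), list(node.responses))
```

Counting written bytes per transfer is only one way to detect completion; a real transport would also have to cope with lost or duplicated packets, which this sketch ignores.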
Patent family:
Publication number | Publication date
CN105389120A | 2016-03-09
GB201513562D0 | 2015-09-16
KR20160027902A | 2016-03-10
US9632973B2 | 2017-04-25
CN105389120B | 2018-09-21
TWI582609B | 2017-05-11
JP2016053946A | 2016-04-14
KR101752964B1 | 2017-07-03
GB2531864B | 2019-04-24
DE102015112634A1 | 2016-03-03
GB2531864A | 2016-05-04
US20160062944A1 | 2016-03-03
TW201621699A | 2016-06-16
JP6189898B2 | 2017-08-30
Cited documents:
Publication number | Filing date | Publication date | Applicant | Patent title
US5448698A | 1993-04-05 | 1995-09-05 | Hewlett-Packard Company | Inter-processor communication system in which messages are stored at locations specified by the sender
JPH08180001A | 1994-04-12 | 1996-07-12 | Mitsubishi Electric Corp | Communication system, communication method and network interface
JP2736237B2 | 1995-03-06 | 1998-04-02 | 技術研究組合新情報処理開発機構 | Remote memory access controller
US6240444B1 | 1996-09-27 | 2001-05-29 | International Business Machines Corporation | Internet web page sharing
US20050080869A1 | 2003-10-14 | 2005-04-14 | International Business Machines Corporation | Transferring message packets from a first node to a plurality of nodes in broadcast fashion via direct memory to memory transfer
US7536468B2 | 2004-06-24 | 2009-05-19 | International Business Machines Corporation | Interface method, system, and program product for facilitating layering of a data communications protocol over an active message layer protocol
US7636813B2 | 2006-05-22 | 2009-12-22 | International Business Machines Corporation | Systems and methods for providing remote pre-fetch buffers
US7694310B2 | 2006-08-29 | 2010-04-06 | International Business Machines Corporation | Method for implementing MPI-2 one sided communication
US7979645B2 | 2007-09-14 | 2011-07-12 | Ricoh Company, Limited | Multiprocessor system for memory mapping of processing nodes
US7925842B2 | 2007-12-18 | 2011-04-12 | International Business Machines Corporation | Allocating a global shared memory
US8452888B2 | 2010-07-22 | 2013-05-28 | International Business Machines Corporation | Flow control for reliable message passing
CN102306115B | 2011-05-20 | 2014-01-08 | 华为数字技术(成都)有限公司 | Asynchronous remote copying method, system and equipment

US9448901B1 | 2015-12-15 | 2016-09-20 | International Business Machines Corporation | Remote direct memory access for high availability nodes using a coherent accelerator processor interface
JP6725662B2 | 2016-07-28 | 2020-07-22 | 株式会社日立製作所 | Computer system and processing method
US20180089044A1 | 2016-09-27 | 2018-03-29 | Francesc Guim Bernat | Technologies for providing network interface support for remote memory and storage failover protection
US20180285942A1 | 2017-03-29 | 2018-10-04 | Oklahoma Blood Institute | Fundraising Platform
CN110704343B | 2019-09-10 | 2021-01-05 | 无锡江南计算技术研究所 | Data transmission method and device for memory access and on-chip communication of many-core processor
Legal status:
2016-11-11 | PLFP | Fee payment | Year of fee payment: 2
2017-06-28 | PLFP | Fee payment | Year of fee payment: 3
Priority:
Application number | Filing date | Patent title
US14/475,337 | US9632973B2 | 2014-09-02 | 2014-09-02 | Supporting RMA API over active message