![]() DISTRIBUTED CALCULATION SYSTEM IMPLEMENTING NON-SPECULATIVE MATERIAL TRANSACTIONAL MEMORY AND METHOD
专利摘要:
Distributed computing system comprising a plurality of computing units (CPU) and a shared memory (MP) between said computing units, characterized in that it comprises at least one access conflict detection hardware module (INSP) said computing units to said shared memory; said or each said hardware conflict detection module being configured to: store at least one probabilistic data structure, indicative of all the addresses of said shared memory involved in all of the current transactions; receiving at least one message indicative of an access request, by a said computing unit, to at least one address of said shared memory; determining, from said probabilistic data structure, whether said address is already involved in a current transaction, and transmitting to said calculation unit a message of presence or absence of access conflicts; receiving at least one indicative or confirmative message of a reservation or release of at least one said address of said shared memory, and updating said probabilistic data structure so that the reserved addresses and the freed addresses are considered, respectively, as / not being involved in an ongoing transaction. Method of using such a system 公开号:FR3019921A1 申请号:FR1453216 申请日:2014-04-10 公开日:2015-10-16 发明作者:Julien Peeters;Nicolas Ventroux;Tanguy Sassolas;Marc Shapiro 申请人:Commissariat a lEnergie Atomique CEA;Institut National de Recherche en Informatique et en Automatique INRIA;Commissariat a lEnergie Atomique et aux Energies Alternatives CEA; IPC主号:
专利说明:
[0001] The invention relates to a distributed computing system implementing a material non-speculative transactional memory, and as its method of use for distributed computing systems. It comes under the domain of parallel computer architectures, on-chip or organized in grid computing. It particularly concerns distributed, distributed and / or redundant computing systems for embedded applications or such as cloud computing, databases, web servers, supercomputing, etc. In order to meet the growing need for application performance, the number of computing resources (processors or processor cores) in parallel architectures is constantly increasing. This brings into being both the need and the problematic of effectively programming this type of architecture in order to benefit at best from the available computing power. When running a parallel application, it can happen that two or more tasks have to exchange data said shared. To ensure system consistency throughout the execution of the application, access to the memory system must be protected. For this, the programmer declares regions in the application code, called critical sections, which guarantee exclusive access to the memory system for any task that is entitled to it. The shared memory model, which is in the majority today, is based on the use of lock-based synchronization primitives (or any lock-like variant) to protect access to shared data within critical sections. However, these primitives hardly achieve scaling. This limitation complicates the programming of parallel applications and requires a significant investment in time to reach an acceptable level of performance. In addition, placing a lock at the beginning of a critical section only guarantees exclusive access to the lock, not the shared data itself. Therefore, the use of a lock does not guarantee the effective protection of the data shared between the tasks but only the exclusive access to the sequence of instructions that uses them. The delimitation responsibility of the critical sections is left to the programmer which is the source of important errors. In order to take full advantage of the computing power present in modern massively parallel architectures, a more promising approach is the use of transaction memories. A transactional memory transforms each access to the memory system into a "transaction" that has the following properties: atomicity, consistency, isolation, and durability (acronym "A.C.I.D."). "Atomicity" means that the simultaneous execution of several transactions must give the same result as their subsequent execution. - "Consistency" means that a transaction must bring the system from one valid state to another valid state. - "Isolation" means that the shared data updates used by the transaction are propagated to the rest of the system 20 only after the transaction is completed and therefore validated. - "Durability" means that updates once propagated can no longer be canceled. The concept of transactional memory was introduced in 1993 by the article by M. Herliy and JEB Moss "Transactional Memory: Architectural 25 Support for Lock-Free Data," 20th Annual Symposium on Computer Architecture, pages 289 - 300. This article discloses in particular a hardware device for implementing a transactional memory, based on an associative cache memory. A notable disadvantage of this solution is that it is blocking. The article by Nir Shavit and Dan Touitou "Software Transactional Memory" Proceedings of the 14th ACM Symposium on Principles of Distributed Computing, pages 204-213 has proposed a purely software and non-blocking realization of a transactional memory. These transaction memories known from the prior art are of the "speculative" type. This means that a transaction is initiated on the assumption that it will not lead to a shared memory access conflict; if such a conflict is detected during execution, the transaction is canceled without leaving any traces (to respect the insulation property). In a speculative transactional memory, the means necessary to guarantee the coherence of the system are very expensive in terms of the memory footprint (memory space required to save the valid state of the system before the start of the transaction), the management of error returns in case of bad speculation etc. Thus, these means are unsuited to areas such as embedded systems. More generally, they unnecessarily consume resources that could be allocated to computing tasks. It is also known to perform, in software, non-speculative transaction memories, in which any conflicts are detected before the start of execution of a transaction. The use of non-speculative transaction memories makes it possible to reduce the memory footprint and increase the energy efficiency of the system compared with the speculative approach. However, due to the lack of speculation, all data accessed by a transaction must be reserved at one time (ie, atomically) before the first operation of the transaction is executed, so to guarantee the absence of interbolocage ("dead-lock") during the reservation; the latter guaranteeing the properties of A.C.I.D. ". Moreover, in the general case, a transaction can access an arbitrary number of data which supposes the possibility of reserving in an atomic way an arbitrary and variable number of data. However, the atomic reservation of several data in an intrinsically parallel system such as a multiprocessor computing architecture is a non-trivial problem to be solved. The reservation has two major parts: the declaration of the set of data to be reserved and the detection of conflicts between this set and any other set of data already reserved by one or more other transactions. A naïve and simplistic realization would be to use a global lock that stops the execution of the whole system during the time of the reservation, which creates a total order between the reservations of the same data and prevents the appearance of a deadlock; however, this solution would lead to unacceptable degradation of performance; in particular, it ignores the case where two transactions reserve two totally disjoint sets of data. US 5,742,785 discloses a multiple data reservation mechanism via dedicated hardware registers associated with each computing unit. The mechanism makes it possible to verify that data are reserved and to then proceed to their atomic writing in memory. Nevertheless, the reservation of the data is not, in itself, atomic. Indeed, for a set of starting data, the reservation of certain data may fail and if necessary update a validity flag linked to each non-reserved data. The absence of atomic reservation for a set of data does not satisfy the necessary and sufficient conditions to realize a non-speculative transactional memory. The invention aims to remedy the aforementioned drawbacks of the prior art. More particularly, it aims to perform efficiently and effectively non-speculative transactional memory to reserve non-blocking - a set of data of variable and arbitrary size. According to a particularly advantageous aspect, the invention implements a material and distributed method of conflict detection having a fixed cost in terms of silicon surface regardless of the number of data manipulated. In addition, the proposed conflict detection method makes it possible to separately manage disjoint transactions, thus offering an additional performance gain that is essential for scaling up. An object of the invention is therefore a distributed computing system comprising a plurality of calculation units and at least one shared memory between said calculation units, characterized in that it comprises at least one hardware module for detecting conflicts. accessing said computing units to said shared memory; said or each said hardware conflict detection module being configured to: - store at least one probabilistic data structure, indicative of all the addresses of said shared memory involved in all of the current transactions; receiving at least one message indicative of an access request, by a said computing unit, to at least one address of said shared memory; determining, from said probabilistic data structure, whether said address is already involved in a transaction in progress, and transmitting to said calculation unit a message indicating the presence or absence of access conflicts; and - receiving at least one indicative or confirmative message of a reservation or release of at least one said address of said shared memory, and updating said probabilistic data structure so that the reserved addresses and the addresses released are considered, respectively, as being / not being involved in an ongoing transaction. According to one embodiment, such a distributed computing system may comprise: at least one hardware Bloom filter for storing said or each said probabilistic data structure; at least one hardware Bloom filter for temporarily storing said or each said probabilistic data structure; at least one hash function module for addressing said Bloom filters; and at least one logic circuit for generating a said presence or absence of access conflicts message from said or at least one said probabilistic data structure, and to update said at least one said data structure. probabilistic data following receipt of at least one indicative or confirmative message of a reservation or release of at least one said address of said shared memory. [0002] According to one embodiment, said or each said hardware module of conflict detection can be configured to: - receive at least a first type of message indicative of an access request, by a said computing unit, to at least one address of said shared memory; determining, from said probabilistic data structure, whether said address is already involved in a transaction in progress, and transmitting to said calculation unit at least a second type of presence message or absence of access conflicts; Receiving at least a third type of message indicative of a request to release a said shared memory address, which is no longer involved in a transaction and update said probabilistic data structure accordingly; and - receive at least a fourth type of message, indicative of a validation or abandonment of at least one said access or release request and, if validated, update said probabilistic data structure Consequently. Moreover, said or each said access conflict detection hardware module may comprise: at least one first hardware Bloom filter for temporarily storing a probabilistic data structure indicative of at least one memory address indicated by a message said first type or said third type; at least one second counting-type hardware Bloom filter for storing said probabilistic data structure indicative of all the addresses of said shared memory involved in one or more current transactions; at least one logic circuit for generating a message of said second type by comparison between the probabilistic data structures stored in said first and second hardware Bloom filters, and for updating said probabilistic data structure stored in said second data filter; Bloom material from that temporarily stored in said first Bloom filter after receiving a message of said fourth type, indicative of a validation of an access request or release of an address of said shared memory. [0003] As a variant, said or each said access conflict detection hardware module may comprise: a first hardware Bloom filter for temporarily storing a probabilistic data structure indicative of at least one memory lease indicated by at least one message of said first type, to which a said computing unit requires read access; a second counting-type hardware Bloom filter for storing a first portion of said probabilistic data structure indicative of all the rentals of said shared memory involved in one or more current transactions, said first portion being indicative of all rentals of said shared memory involved in reading in one or more transactions in progress; a third hardware Bloom filter for temporarily storing a probabilistic data structure indicative of at least one memory location indicated by at least one message of said first type, to which a said computing unit requires write access; a fourth hardware count Bloom filter for storing a second portion of said probabilistic data structure indicative of all locations of said shared memory involved in one or more current transactions, said second portion being indicative of all the rentals of said shared memory involved in writing in one or more transactions in progress; at least one logic circuit for generating a message of said second type by comparison between the probabilistic data structures stored in said first and second hardware Bloom filters or said first and fourth hardware Bloom filters, to update said first part of said probabilistic data structure stored in said second material Bloom filter from that stored temporarily in said first Bloom filter following receipt of a message of said fourth type, indicative of a validation of a query of accessing or releasing a location of said shared memory, for generating a message of said second type by comparison between the probabilistic data structures stored in said third and said second material Bloom filter or said third and fourth hardware Bloom filters, and to update said second part of said struct a probabilistic data file stored in said fourth material Bloom filter from that stored temporarily in said third Bloom filter after receiving a message of said fourth type indicative of a validation of an access request or release of a rental of said shared memory. According to another embodiment, said or each said hardware module for conflict detection can be configured to: - receive at least a first type of message indicative of an access request, by a said computing unit, at least renting said shared memory and updating said probabilistic data structure accordingly; determining, from said probabilistic data structure, whether said location is already involved in a transaction in progress, and transmitting to said calculation unit at least a second type of presence message or absence of access conflicts; receiving at least a third type of message indicative of a request to release a said shared memory lease which is no longer involved in a transaction and update said probabilistic data structure accordingly. In this case, said or each said access conflict detection hardware module may comprise: at least one counting type hardware Bloom filter 30 for storing said probabilistic data structure indicative of all the rentals of said shared memory involved in one or more ongoing transactions; and at least one logic circuit for generating a message of said second type from said probabilistic data structure and a message of said first type, and for updating said probabilistic data structure stored in said hardware Bloom filter following reception of a message of said first or third type In this case, moreover, said or each said access conflict detection hardware module may comprise: a first hardware-based Bloom filter, of the counting type, for storing a first part of said probabilistic data structure indicative of all the rentals of said shared memory involved in one or more transactions in progress, said first part being indicative of all the rentals of said shared memory involved in reading in one or more transactions In progress ; and a second counting-type hardware Bloom filter for storing a second portion of said probabilistic data structure indicative of all locations of said shared memory involved in one or more current transactions, said second portion being indicative of all rentals of said shared memory involved in writing in one or more transactions in progress. According to another embodiment, said or each said access conflict detection hardware module may comprise a hardware Bloom filter for temporarily storing said or each said count type probabilistic data structure. The or each said access conflict detection hardware module may include a first pair of Bloom filters for detecting read access conflicts and a second pair of Bloom filters for detecting write access conflicts. . [0004] According to one embodiment, such a distributed computing system may comprise a plurality of said access conflict detection hardware modules, each associated with a segment of said shared memory. In this case, each computing unit may comprise a memory access initiator module and a hardware module for acquiring said access conflict detection hardware modules, each said hardware acquisition module being configured to perform an exclusive acquisition and the at least one hardware module for detecting access conflicts necessary to detect access conflicts for a transaction. [0005] Still in this case, furthermore, each said acquisition hardware module may be configured to: - receive at least one message indicative of an access request, by said memory access initiator module, to an address of said memory shared; and identifying, from this message, the access conflict detection hardware module associated with said address; receiving and storing a message indicative of a set of hardware modules for detecting access conflicts in use; then determining, by means of said stored message, whether said identified detection hardware module is in use and: if it is not, updating said indicative message of a set of hardware detection hardware modules. access conflicts in use to indicate that said hardware access conflict detection module is now in use and to transmit it, and to transmit to said detection hardware module identified said message indicative of a request for access access; otherwise transmitting without changes said indicative message of a set of access conflict detection hardware modules in use; subsequently, again receiving said message indicative of a set of access conflict detection hardware modules in use, updating it to indicate that the access conflict detection hardware module (s) associated with the Or at the memory addresses implied by said transaction are no longer in use and transmit it; receiving, from at least one access conflict detection hardware module, at least one presence message or absence of access conflicts and, if at least one said message is indicative of the presence of a conflict of access; accessing, transmitting to said one or more hardware modules for detecting access conflicts a message for abandoning or canceling said or each said access request; receiving and storing another message indicative of a set of access conflict detection hardware modules in use; then receiving, from said memory access initiator module, a message indicative of the completion of a transaction involving one or more addresses of said shared memory, updating said message indicative of a set of hardware modules for conflict detection; in-use access stored to indicate that the access conflict detection hardware module (s) associated with the memory address (es) involved by said transaction are no longer in use and transmit it, and transmit to at least one access conflict detection hardware module at least one message indicative of a request to release a said shared memory address, which is no longer involved in a transaction. In addition, said hardware acquisition modules can be interconnected by a communication network implementing an exclusive non-blocking access technique. In addition, said acquisition hardware modules can be interconnected by a network having a ring-type logical topology and be configured to transmit step by step on said network a token 25 carrying said message indicative of a set of hardware modules. detection of access conflicts in use. According to one embodiment, such a distributed computing system may comprise a plurality of tiles, said shared memory and a communication network connecting said tiles to each other and to said shared memory and at least one said hardware conflict detection module. access, each said tile comprising at least one said computing unit. [0006] Another object of the invention is a method of using such a distributed computing system comprising the following steps: a) using a computing unit to transmit to at least one access conflict detection hardware module at least a message indicative of a request to access an address of a shared memory; b) using said or each said access conflict detection hardware module to determine, from a respective probabilistic data structure, indicative of a set of addresses of said shared memory involved in an ongoing transaction, if said address is already involved in a current transaction, and for transmitting a presence or absence of access conflicts message addressed to said computing unit; c) using said calculation unit to determine, from the presence message (s) or absence of access conflicts received from said or each said detection hardware module, whether a transaction involving said or each address of said shared memory can be performed or not, and to transmit to said or each said detection hardware module at least one message indicative of a reservation or release of at least one said address of said shared memory; and using said or each said detection module to update said probabilistic data structure so that the reserved addresses and the freed addresses are considered, respectively, as being / not being involved in a current transaction. According to one embodiment of such a method: said distributed computing system may comprise a plurality of said computing units, a said shared memory and a plurality of said access conflict detection hardware modules, each associated with at least one address range of said shared memory; each said computing unit comprises a memory access initiator module and a hardware module for acquiring said hardware modules for detecting access conflicts; Said step a) may comprise the following operations: a1) using a hardware acquisition module to receive and store a message indicative of a set of access conflict detection hardware modules in use; a2) using a memory access initiator module associated with said hardware acquisition module for transmitting to said hardware acquisition module at least one said message indicative of a request to access an address of said shared memory; a3) using said acquisition hardware module to identify the hardware access conflict detection hardware module (s) associated with said or each said memory address; a4) determining, by means of said indicative message of a set of access conflict detection hardware modules in use, whether the access conflict detection hardware module or modules associated with said or each said address memory of the transaction are in use and: if they are not, transmit them said message indicative of an access request, update said indicative message of a set of hardware modules for detecting access conflicts in use to indicate that said or each said hardware access conflict detection module is now in use and transmit it; otherwise transmit it without modification; And said step b) may comprise the following operations: b1) using said hardware acquisition module to receive, from at least one access conflict detection hardware module, at least one presence message or no conflicts of access access and, if said or all said messages are indicative of a lack of access conflicts, transmit to said 25 or 25 hardware modules for detecting access conflicts confirmation messages of said access requests; b2) using said acquisition hardware module to receive an indicative message that a transaction involving one or more addresses of said shared memory has been completed, updating said indicative message of a set of conflict detection hardware modules; in use access to indicate that the acquisition hardware module or modules associated with the memory address (es) involved by said transaction are no longer in use and transmit it. Advantageously, said hardware acquisition modules may be interconnected by a communication network implementing a non-blocking exclusive access technique. In addition, said acquisition hardware modules may be interconnected by a network having a logical topology of the ring type, said message indicative of a set of hardware modules for detecting access conflicts in use being transmitted from step by step on said network. Other characteristics, details and advantages of the invention will emerge on reading the description made with reference to the appended drawings given by way of example and which represent, respectively: FIG. 1, the architecture of a control system distributed computing according to one embodiment of the invention; FIG. 2, the association between hardware modules for detecting access conflicts and ranges of the distributed memory of the system of FIG. 1; FIG. 3, the synchronization between hardware modules 20 for acquiring access conflict detection modules in the system of FIG. 1; FIGS. 4A and 4B, the sequence diagram of a method according to one embodiment of the invention, implemented by means of the system of FIG. 1; FIGS. 5A and 5B, respectively, the functional diagram of a hardware module for acquiring access conflict detection modules according to a possible embodiment of the system of FIG. 1 and a diagram illustrating the operation of FIG. availability check of access conflict detection modules; FIGS. 6A and 6B, respectively, the functional diagram of an access conflict detection hardware module and a diagram illustrating the data reservation and release operation according to a possible embodiment of the system of FIG. 1. ; FIG. 7, a detail of the functional diagram of an access conflict detection hardware module according to another possible embodiment of the system of FIG. 1; and FIG. 8, the functional diagram of a hardware module for detecting access conflicts according to another embodiment of the system of FIG. 1. A calculation system according to the invention can be implemented by means of FIG. a computer architecture distributed on a chip or organized in a grid. In both cases, the system preferably has a modular structure, composed of "tiles" or "nodes" comprising one or more computing resources. Such a system is characterized by a distributed hardware locking mechanism that allows the atomic acquisition of multiple hardware or software resources. This mechanism is based on a non-blocking synchronization that allows a verification of the availability of resources to acquire and, where appropriate, their acquisition. Compared to the known transaction memories of the prior art, such a system has at least one of the following advantages: It allows an atomic reservation of several values (memory addresses); therefore, it meets the necessary and sufficient conditions to apply to non-speculative transactional memories; - It does not depend on a software mechanism or an additional computing resource to operate; Its silicon cost and its memory footprint are fixed for a given hardware architecture regardless of any software characteristic. FIG. 1 illustrates the architecture of a distributed computing system according to one embodiment of the invention. This architecture comprises a plurality of "tiles" or "nodes" TO, T1, TN (homogeneous or heterogeneous), each comprising at least one calculation unit UC. The architecture also includes at least one INSP hardware access conflict detection module (also known as an "inspector") containing the logic necessary to perform conflict detection between transactions initiated by the calculation units. In the case of Figure 1, each tile includes exactly one inspector. The system also includes a shared memory MP consisting of (N + 1) memory areas PMO, PM1, ..., PMN, to which all the computing units can access write and / or read, and a network of RCOMM communication, dedicated or hierarchical, ensuring the interconnection of the tiles with each other and with the memory ranges. The network RCOMM 10 may be a network-on-a-chip in the case of an integrated realization of the system. As illustrated in Figure 2, each memory range includes a set of memory locations having consecutive addresses. Each address of the shared memory is present in a memory range. In an optimal embodiment, each address of the shared memory is present in one and only one memory range (case considered here). Each said memory range is associated with one and only one access conflict detection module; thus, the INSPi module detects the conflicts that occur when several computing units seek to simultaneously access the same memory address of the range PMi, with i = 0 - N. Each CPU calculation unit comprises a processor (or core of processor, but in the following we will speak simply of "processor") PR, which performs the operations of calculation and data processing and which is actually at the origin of the transactions, and an acquisition module (also called 25 "collector" ) COLL which serves as an interface between the processor and the modules for detecting access conflicts to the ranges of the shared memory. Each acquisition module receives requests from the associated processor via a programming interface composed of dedicated registers (RC, RD, RS in FIG. 5A) identified by specific memory addresses. Depending on the type of request, the acquisition module may synchronize with the other acquisition modules and / or initiate a conflict detection involving one or more access conflict detection modules. These two steps form a two-phase protocol that makes it possible to perform a memory transaction. In FIG. 1, arrows symbolically represent the message exchanges (which will be detailed below) between the processors and the acquisition modules, between acquisition modules, and between COLL acquisition modules and INSP conflict detection modules. . All these exchanges take place through the RCOMM network. Several variants of the architecture of Figure 1 can be envisaged, without departing from the scope of the present invention. As non-limiting examples: memory ranges can be physically integrated with tiles or nodes; preferably, the same tile or node will contain the conflict detection module corresponding to the memory slots integrated in the node. The number of memory ranges, and therefore of conflict detection modules, can be different from the number of calculation units, but all the memory ranges PMi are allocated to detection modules. A single conflict detection module can handle all shared memory, but this embodiment is not optimal, especially for a large number of processors. - The COLL acquisition modules may be absent, in which case their functionalities may be implemented, in software, by the corresponding processors (this is not necessary in the case of a single conflict detection module). - The single communication network RCOMM can be replaced by a plurality of networks dedicated to the different types of messages exchanged. In another implementation, a software solution using shared memory can also be used for this purpose. For example, the messages can be writes and reads on memory addresses that are not hidden and visible by all the nodes whose integrity is guaranteed by atomic access mechanisms. [0007] Synchronization between the acquisition modules is necessary for the proper functioning of conflict detection. Indeed, the characteristics of the non-speculative transaction memories require that the conflict detection modules are acquired in an atomic way to guarantee the coherence of the transactional system. In other words, a COLL module can "acquire" an INSP module to check whether a transaction can be completed only if the transaction is not already "reserved" for verification of another transaction. According to a particular embodiment of the invention, this synchronization is based on a unique and exclusive J token which circulates between the acquisition modules on an RA network having a logical ring topology, as illustrated in FIG. J contains the reservation status of all INSP modules. One possible implementation consists of a bit vector, each bit representing the reservation or not of an INSP module. The RA network may correspond to a particular use of the single communication network RCOMM, or have an independent physical existence. Other embodiments are possible; in fact, any atomic mechanism for acquiring an exclusive right can be used. For example, synchronizations can be writes and reads on memory addresses that are not hidden and visible by all the nodes whose integrity is guaranteed by atomic access mechanisms. In order for a memory transaction to be executed by a processor PR, the latter must first initiate a request to start the transaction with its acquisition module COLL and send it the memory addresses of the data that will be used by the transaction. In the first phase of the conflict detection protocol, these addresses will be used to identify the INSP modules necessary for the detection of access conflicts. Each of these modules is identified or not according to the memory address ranges that it manages and the addresses of the data of the transaction. More specifically, an INSP module is identified as soon as at least one of the data involved in the transaction is stored in the memory range (s) associated with it. At the end of this identification, the COLL acquisition module waits to receive the token to check the availability of previously identified INSP conflict detection modules. If they are available, they are reserved or "acquired" - which, in the embodiment considered here, is to update the token. The token is then released, and the second phase can begin. [0008] Otherwise, the token is released and the acquisition module waits for it to return to make an attempt to acquire the access conflict detection modules. The acquisition module can also inform the processor of the unavailability of the selected access conflict detection modules; in this case, the transaction is put on hold. [0009] The second phase of the protocol consists of the actual conflict detection. For this purpose, the COLL acquisition module sends specific requests to the identified INSP modules to proceed with the reservation of the data used by the transaction. These queries are intended to verify the presence or absence of the data used by the transaction in data structures stored by the corresponding INSP modules. If a data is present in such a data structure, it means that it is already used by another transaction, which corresponds to a conflict. At the end of the second phase of the protocol, depending on the absence or the presence of conflicts, the processor is respectively authorized or not to execute the transaction and the overall state of the data reservations is updated in the modules. INSP on order of the COLL acquisition module. The INSP modules are released by the COLL acquisition module when the token returns to it. At the end of the transaction, a very similar protocol is reused to update the overall state of the data reservations, but this time by releasing the data used by the transaction. Therefore, at this stage the protocol does not include conflict detection. Figures 4A and 4B illustrate in detail the different steps of a particular embodiment of this protocol. Figure 4A relates more specifically to the reservation phase of the INSP modules and the access rights to the transaction data and the conflict detection phase ("preamble" of the transaction). Figure 4B illustrates the release phase of INSP modules and data access rights of the transaction ("epilogue" of the transaction). These figures relate to the case where a processor PR2, assisted by the associated acquisition module COLL2, initiates a transaction involving data stored in a range PM5 of the shared memory, supervised by an INSP5 conflict detection module. The acquisition modules COLL1 and COLL3, which precede and follow COLL2 in the ring network RA, intervene only for the passage of the token J. In the following and in these figures, for the sake of brevity, we speak of "collectors" and "inspectors" to indicate the COLL and INSP modules, respectively. [0010] As explained above, the preamble of the transaction (Figure 4A) includes a first phase (Phase 1) of identification (stage 1) and reservation or "acquisition" (stage 2) of the inspectors, followed by a second phase ( Phase 2) of conflict detection (step 3) and release of inspectors (step 4). [0011] In step 1, the processor PR2 sends the collector COLL2 a "TX_START" message indicative of the start of a transaction, followed by a series of messages containing the addresses of the data involved in the transaction. As it receives the addresses, the collector COLL2 sends acknowledgments to the processor (dotted arrows in the figure) and determines the memory ranges, and therefore the corresponding inspector modules. A "TX_DATA_END" message indicates the end of the sending, and of step 1. Step 2 begins when the token is received by the COLL2 module from the COLL1 module. It consists of the acquisition (or "reservation") of the inspector modules identified during step 1 (in this case, the only INSP5 module). In the embodiment considered here, the token J is a message that circulates on the ring network RA and contains a list of the inspector modules that have been acquired by the different collector modules. When it receives the token J, the module COLL2 thus checks whether the inspector module INSP5, which it intends to acquire, is marked as reserved in this list. If so, it releases the token by transmitting it to the COLL3 module and resumes waiting to receive it again; otherwise, it marks INSP5 as reserved in the list (which constitutes the acquisition of this inspector module), transmits to the COLL3 module the updated token and proceeds to step 3. At the beginning of step 3, the module collector COLL2 transmits to the selected and acquired inspector modules (here, the only module INSP5) messages of a first type, containing the addresses of the data involved in the transaction and corresponding to each said inspector module. This sending can be done address by address or by packets. The inspector modules comprise a memory storing a probabilistic data structure (for example a counting Bloom filter, as will be explained in detail below) indicative of the data already involved in a transaction. Each INSP module detects, with the help of this data structure, potential access conflicts and sends in response messages of a second type indicating the presence or absence of conflicts (the conflict detection operation will be described later). A message of a fourth type, validation or abandonment, is also sent to the inspector modules. More specifically, two cases may arise: (1) When COLL2 has received all the confirmations indicating the absence of conflict for each address involved in the transaction, the latter sends to each inspector INSP involved in the transaction (here INSP5) a message of the fourth type indicating a validation of the transaction. The probabilistic data structure stored in each inspector module is then updated to take into account the reservation of the data involved in the transaction. (2) On the other hand, if at least one inspector detects a conflict, a message of the fourth type indicating abandonment is sent by COLL2 to each inspector (here INSP5). Thus the probabilistic data structure of each inspector involved in the transaction remains unchanged. Irrespective of the case encountered, the inspectors send back an acknowledgment message of the validation or abandonment message of the transaction. As soon as a conflict is detected, or when no conflict is detected, the COLL2 module also sends a message to the processor PR2. The latter will then read a status register to find out whether or not there is a conflict for the transaction. If there is no conflict, the transaction can run. This completes step 3. [0012] At the beginning of step 4, the collector module COLL2 can proceed with the release of the inspector modules that it has acquired previously (INSP5 in this case), and which it does not need any more for the moment. This takes place while the PR2 processor executes the transaction. To release the inspector modules, it waits for the token J. Upon receipt, the collector module COLL2 updates the token J by erasing the inspector INSP5 from the list of acquired inspector modules, and retransmits the token J to the next collector. The advantage of having a COLL2 collector module independent of the processor is that this release operation does not interfere with the execution of the transaction. [0013] The epilogue (FIG. 4B) begins after the end of the transaction, when the processor PR2 sends a signal "TX_END" to the collector module COLL2. The latter therefore starts waiting for the token and, when it receives it, proceeds to the acquisition of the inspector modules as explained above with reference to step 2 of phase 1 of the preamble. Advantageously, the collector module stores in memory the addresses of the data of the transaction, which avoids the processor having to retransmit (see step 1 of phase 1 of the preamble). Such retransmission may nevertheless be necessary in case of saturation of this memory. The processor is informed of this event by acknowledgments received during Step 1 of Phase 1 of the Preamble. During the phase 2 / step 3 of the epilogue, the collector module COLL2 transmits to the inspector module INSP5 messages of a third type, containing the addresses of the data that were involved in the transaction that has just ended. As for the reservation, this shipment can be addressed by address or by packages. The inspector modules release the access rights to the data addresses in a manner that will be detailed later, and which is symmetrical to their reservation. A fourth type of validation message sent from the COLL2 collector to the INSP5 inspector completes the data release phase. The epilogue of the transaction ends with a fourth stage of release of the inspector modules (here INSP5), identical to the corresponding step of the preamble, except that the inspector modules marked as reserved and concerned by the transaction that ends are marked as free. FIG. 5A illustrates a block diagram of a COLL acquisition module (or "collector") that can be used for the implementation of the protocol described above with reference to FIGS. 4A and 4B. This module comprises: A controller CTRC which manages all the operations performed by the collector, namely: (1) the exchange of messages with the processor PR (via a dedicated interface) and with the acquisition modules (via the RCOMM network), and (2) synchronization with other collectors (also via the RCOMM network, which may include a dedicated ring network). This controller may comprise, or consist of, a state machine and, where appropriate, one or more interface circuits. A set RG of general registers, in particular comprising an RC command register, an RD data register and an RS status register. These registers are accessible from the associated processor (via the CTRC controller) and allow the exchange of data between the latter and the collector module. They are associated with specific memory addresses and thus constitute a programming interface for the PR processor. Specifically, the processor writes in the RC register the commands "TX START", "TX DATA END" and "TX END" mentioned above, which control the operation of the controller CTRC. He writes to the register RD the addresses of the data to be reserved (and, where appropriate, to be released) and reads in the register RS the presence or absence of an access conflict (or an error, or of an overflow of the data structure). - A temporary storage memory MFA, realizing a queue for the addresses sent by the processor. This memory makes it possible to limit the repeated sending of the same addresses to the collector during the different stages of a transaction (prologue and epilogue). In particular, the presence of this memory makes it possible to avoid the retransmission of the addresses of the data to be released during the epilogue of the transaction. Such retransmission is necessary in case of saturation of the memory. - A FHMC module realizing, in a hardware way, a hash function, which makes it possible to select a single inspector module from an address value sent from the processor. The result of the hash, meanwhile, allows to set the bit corresponding to the inspector selected in the reservation register (see below). Each FHMC module performs the same hash function, so each address is associated with the same inspector module regardless of the collector concerned. An RRV reservation register and an RSYN synchronization register, as well as a bit-to-bit comparator CMP between these registers to check the availability of the "inspector" modules required for a transaction. Each of the RRV and RSYN registers includes a one-bit memory cell for each system inspector module. As mentioned above, at the beginning of the prologue of a transaction the processor writes in the control register RC of the collector module associated with it a command "TX_START" in order to notify him that a new transaction is initiated. Then, the processor sends each data address by writing it to the RD data register. Each write on this register generates a backup of the value of the address of the data in the queue and MFA storage and is also transmitted to the FHMC hash module. The FHMC module returns a value corresponding to an index in the RRV reservation register. This index represents the inspector module in charge of the address range in which the address of the data previously transmitted to the collector is located. The end of the sequence of sending the data addresses to the collector is materialized by writing, in the control register RC by the processor PR, a command "TX_ DATA_ END". At the end of this sending, the RRV reservation register represents all the inspector modules necessary for the detection of conflicts for the new transaction. Each bit of RRV at '1' represents an inspector to acquire. Then, the collector module COLL receives, via the controller CTRC, the token J which rotates between the different collector modules and which contains a list of inspector modules already acquired (which avoids having to access a storage element centralized). In this embodiment, the token J is composed of a word of n bits. Each bit of J thus represents the reservation, or not, of an INSP inspector. The token J received by the COLL module is temporarily stored in the RSYN synchronization register. [0014] Verification of the availability of the inspector modules awaiting acquisition is performed by performing, by means of the comparator CMP, a bit-to-bit comparison between the RRV reservation register and the RSYN synchronization register. The result - for example a value "0" if all the inspector modules to be reserved are available, a value "1" if at least one is not - is transmitted to the controller CTRC. If none of the inspecting modules awaiting acquisition is already used by another transaction ("0" output of the comparator), then the synchronization register (and thus the value of the token J) is updated to reflect the acquisition of the inspector modules by the collector. This can be done via a bitwise "OR" operation of the RRV and RSYNC registers, for example. On the other hand, if an inspector module is already used by another transaction (and therefore the corresponding bit is "1" in the synchronization register, which results in an output "1" of the comparator), then the value of the token is not updated and the collector transfers it to the next collector until it returns to the next turn. Figure 5B illustrates this synchronization protocol between collectors. [0015] If it receives a "0" value from the comparator, the CTRC controller can trigger the data reservation phase, which consists in sending the address of each of them to the corresponding inspector modules in order to search for the presence of a conflict. eventual with another transaction. The data (more exactly, their addresses) temporarily stored in the memory MFA are reused if the latter has a sufficient capacity; otherwise, the processor must return the transaction data in the same way as in the first step. The reservation of the data is carried out by a specific exchange protocol between the collector and the inspectors, implemented via the CTRC controller. Each request sent by the collector contains the memory address of the data. The inspector's response contains two flags: an acquittal and the presence of a conflict or an overflow. The first query issued to an inspector also contains a command to distinguish the acquisition and release requests. The acquittal, received from the inspector in response to the sending of each address, ensures that the search for conflicts has been carried out. The status register RS is updated with each response of an inspector according to the presence of an error, a conflict, an overflow or any combination of these three events. This register is read by the processor, which is thus informed of the possible presence of access conflict making the transaction impossible. At the end of sending all the data of the reservation request, the collector sends each inspector a last message containing a validation command (in the absence of access conflicts) or abandonment (in the presence at least one such conflict) encoded on two bits. At the end of the prologue, after reception the inspector modules that were used are released in the same manner as they were acquired. For this, when the token passes through the collector, a "AND" bitwise operation is used (as opposed to the "OR" bitwise for acquisition) between the reservation register (unchanged) complemented by 1 (inversion of all bits) and the synchronization register. Thus, each bit corresponding to an inspector identified by the reservation register is set to "0". The epilogue, executed after the end of the transaction, is very similar to the prologue except that there is no conflict detection. [0016] FIG. 6A illustrates a block diagram of an INSP access conflict detection module (or "inspector") that can be used to implement the protocol described above with reference to FIGS. 4A and 4B. This module includes: - A CTRI controller containing a state machine. This controller forms the interface between the module and the interconnection network RCOMM and manages the different stages of conflict detection, which will be detailed below. - An FHM module containing a set of hardware hash function modules denoted FHM [i]. Each FHM module [i] realizes 15 hardware, a hash function on each address indicative of data to be reserved. Thus, each address is converted by the FHM module into a set of shorter "words" binary which will be used to address the bloom filters FBT and FBS. The size of the outputs of the FHM block is thus dependent on the size of the bloom filters. 20 - A so-called "temporary" FBT Bloom filter which contains n storage elements FBT [0] to FBT [n-1]. Each FBT storage element [i] contains a Boolean value 1 if it has been addressed by the FHM during the current transaction, and has the value 0 otherwise. A FBS counting Bloom filter 25 having a number n of storage elements equal to that of the temporary FBT Bloom filter. The storage elements are denoted FBS [0] to FBS [n-1]. They will be updated from the FBT if successful after the acquisition of all addresses involved in the transaction. Each FBS storage element [i] contains the number of times that FBT [i] was set to 1 for all transactions previously authorized to execute. This module keeps this information for all current transactions. A comparator CMP takes as input the FBS storage elements selected by the FHM for each address indicative of a datum to be reserved. It checks that at least one of the storage elements is worth O. In the positive case, ie if there is at least one 0, it means that the address is available. In the negative case, ie if there is no storage element at 0, it means that the address can not be reserved because it is probably already reserved (only "probably" because it there is a risk of false positive). The requesting transaction must then be canceled. In all cases the corresponding information is returned to the CTR controller. [0017] For the record, a Bloom filter is a compact probabilistic data structure well known in the state of the art, used to perform tests for the presence of an element in a set. A Bloom filter makes it possible to determine with certainty the absence of the element in the set (there can not be false negatives), and probabilistically the presence of said element in the set (a non-negative rate is allowed). no false positives). The size of a Bloom filter does not depend on the number of elements contained in the set, which allows a great compactness; however, a compromise must be found between the size of the filter and the rate of false positives that can be tolerated. In the case of the so-called "ordinary" FBT Bloom filter, they are binary storage elements, whereas in the case of the FBS "counting" Bloom filter they are counters capable of storing integer values ( nothing prevents having two counting Bloom filters, one of which is used as an ordinary filter). For the implementation of the invention, "material" Bloom filters are preferred, that is to say made by means of dedicated logic circuits. The CTR controller is in charge of controlling the different stages of the detection of conflicts described hereinafter and repeated for each address involved in the current transaction during the reservation request: Step 1: a memory address involved in the transaction is presented to the controller CTR via the communication network RCOMM following a sending by a collector. Step 2: the value of the address is propagated to the module 5 FHM. This module returns a set of indices that each identify a storage element in the Bloom FBT and FBS filters. - Step 3: the FBT storage elements identified by these indices take a value "1", the others retain their values. Step 4: The contents of the FBS storage Bloom filter storage elements identified by the FHM module are transmitted to the comparator CMP. Step 5: The comparator CMP uses the values sent by the FBS storage Bloom filter to look for the presence of a conflict. A conflict is identified if none of the received values is O. Whether there is a conflict or not, the corresponding information is returned to the CTR. Step 6: the presence or absence of a conflict and / or overflow is notified by the controller CTR to the collector module COLL. Step 7: When all the addresses involved in the current transaction have failed to identify a conflict, the FBS module 20 is updated from the FBT module. The content of the FBS module becomes the sum of the contents of the FBT module with itself. In the event of an overflow of the accumulated value on FBS, this event is sent to the CTR which itself notifies the COLL to cancel the transaction. The content of the FBS is then maintained at its old value. When all the addresses of the reservation request have been processed and no conflict and overflow has been detected, the requesting transaction is allowed to execute, as explained above. If the controller CTR receives a validation message from the COLL module via the network RCOMM, a logic circuit CMJ (an adder) 30 updates the state of the reservations by incrementing the counters of the backup Bloom filter whose indices correspond one bit to '1' in the temporary Bloom filter, as shown in Figure 6B. Conversely, if dropped, the backup Bloom filter remains unchanged. In both cases, the temporary Bloom filter is reset. The release of the data at the end of the epilogue of the transaction 5 is done in a similar manner but without detection of conflicts, the counters of the backup Bloom filter being decremented instead of being incremented. Bloom's filter properties imply that all access conflicts will be detected (no false negatives), but some transactions will be prevented even in the absence of real access conflicts 10 (possibility of false positives) . In a manner known per se, the various hardware modules of the system may be made from dedicated logic circuits and / or programmable components such as FPGAs. The invention has been described with reference to a particular embodiment. However, many variations can be envisaged. The following list is not exhaustive. According to a first alternative embodiment, as has been mentioned above, there can be only one access conflict detection module (inspector) which, therefore, manages the entire memory space Addressable by the different processors. In this case, the acquisition modules no longer need reservation registers. A form of synchronization must however be maintained to prevent two collectors simultaneously access this single inspector module; this synchronization can, for example, be provided by a binary token. According to a second alternative embodiment, the acquisition modules may not have a temporary storage memory MFA. However, in the absence of this submodule, the processor must return the addresses of the transaction data at each step because they are no longer stored in the acquisition module. According to a third alternative embodiment, the step of releasing the access conflict detection modules before the end of the prologue and the epilogue can be omitted. This variant is particularly relevant in the case of a large multiprocessor architecture to reduce the waiting time. According to a fourth alternative embodiment, each conflict detection module may contain two pairs of Bloom filters: an FBTL pair, FBSL for the read data and a FBTE, FBSE for the written data (see Figure 7). Information in the reservation messages transmitted from the processor to the acquisition module, and from the latter to the conflict detection module, makes it possible to distinguish read and write accesses. The detection of conflicts is then more precise. A conflict must then be detected only if at least one write during a transaction takes place at an address read by other transactions. For the reservation of the access to a data in writing, there is conflict as soon as the address of the data is present in the filter FBSE or FBSL. For reservation of access to read data, there is conflict only when the data address is present in the FBSE filter. According to a fifth alternative embodiment, illustrated in FIG. 8, both the temporary Bloom filter FBT 'and the backup Bloom filter FBS may be of the counting type. The counters of the temporary Bloom filter FBT 'are incremented as the messages of the first type are received, thus at the end of the reservation step this filter contains a vector of integer values constituting a digital signature of the transaction. . Upon receipt of a transaction validation message (fourth type), these integer values are added to the corresponding counters of the FBS Backup Bloom filter. During the data release phase, the release messages (third type) directly decrement the counters of the FBS backup Bloom filter, without passing through the temporary Bloom filter FBT '. There is therefore no need for a confirmation message (fourth type) during said release phase. [0018] Thus, in this embodiment, a reduction in the number of messages exchanged and obtained at the cost of a slight increase in the hardware complexity of the access conflict detection modules. These different alternative embodiments can be combined with each other. In particular, in the fifth alternative embodiment, each Bloom filter can be replaced by a pair of Bloom filters dedicated to read and write accesses (see the fourth alternate embodiment and Figure 7). In addition, Bloombinary or counting filters are not the only probabilistic data structures that can be used in the context of the invention. Another example of such a probabilistic data structure is the "Count-min-sketch". Like Bloom's filters, it uses hash functions; however, its role is to count the occurrence of an element in a stream, not the membership of an element to a set. To apply a "Count-min-sketch" to a transactional memory, proceed as follows. First, it is considered that the sequence of addresses from the memory access of a transaction constitutes a flow. Then a conflict between two transactions is determined by comparing the occurrences of the addresses in the respective flows by means of a "Count-min-sketch". This structure can be implemented in software or hardware.20
权利要求:
Claims (16) [0001] REVENDICATIONS1. Distributed computing system comprising a plurality of computing units (UC) and at least one shared memory (MP) between said computing units, characterized in that it comprises at least one hardware module for detecting access conflicts ( INSP) of said computing units to said shared memory; said or each said hardware conflict detection module being configured to: - store at least one probabilistic data structure, indicative of all the addresses of said shared memory involved in all of the current transactions; receiving at least one message indicative of an access request, by a said computing unit, to at least one address of said shared memory; Determining, from said probabilistic data structure, whether said address is already involved in a transaction in progress, and transmitting to said calculation unit a message indicating the presence or absence of access conflicts; and - receiving at least one indicative or confirmative message of a reservation or release of at least one said address of said shared memory, and updating said probabilistic data structure so that the reserved addresses and the addresses released are considered, respectively, as being / not being involved in an ongoing transaction. 25 [0002] The distributed computing system of claim 1 wherein said or each said hardware conflict detection module comprises: at least one hardware Bloom filter (FBS) for storing said or each said probabilistic data structure; at least one hardware Bloom filter (FBT) for temporarily storing said or each said probabilistic data structure; at least one hash function module (FHM) for addressing said Bloom filters; and at least one logic circuit (CMP) for generating a said presence or absence of access conflicts message from said or at least one said probabilistic data structure, and to update said at least one said probabilistic data structure following the reception of at least one indicative or confirmative message of a reservation or of a release of at least one said address of said shared memory. [0003] 3. distributed computing system according to one of the preceding claims wherein said or each said hardware module for conflict detection is configured to: - receive at least a first type of message indicative of an access request, by a said computing unit, at least one address of said shared memory; determining, from said probabilistic data structure, whether said address is already involved in a transaction in progress, and transmitting to said calculation unit at least a second type of presence message or absence of access conflicts; Receiving at least a third type of message indicative of a request to release a said shared memory address, which is no longer involved in a transaction and update said probabilistic data structure accordingly; and - receive at least a fourth type of message, indicative of a validation or abandonment of at least one said access or release request and, in case of validation, update said probabilistic data structure Consequently. [0004] The distributed computing system of claim 3 wherein said or each said access conflict detection hardware module comprises: at least one first hardware Bloom filter (FBT) for temporarily storing a probabilistic data structure indicative of at least one memory address indicated by a message of said first type or said third type; at least one second counting type hardware Bloom filter (FBS) for storing said probabilistic data structure indicative of all the addresses of said shared memory involved in one or more current transactions; at least one logic circuit (CMP) for generating a message of said second type by comparison between the probabilistic data structures stored in said first and second material Bloom filters, and for updating said probabilistic data structure stored in said second. a hardware Bloom filter from that temporarily stored in said first Bloom filter after receiving a message of said fourth type, indicative of a validation of a request to access or release an address of said memory shared. [0005] A distributed computing system according to claim 4 wherein said or each said access conflict detection hardware module (INSP) comprises a hardware Bloom filter (FBT) for temporarily storing said or each said probabilistic data structure of the count type. [0006] The distributed computing system according to one of claims 4 or 5 wherein said or each said hardware access conflict detection module comprises a first pair of Bloom filters (FBTL, FBSL) for detecting conflicts of interest. read access and a second pair of Bloom filters (FBTE, FBSE) for detecting write access conflicts. [0007] 7. distributed computing system according to one of the preceding claims comprising a plurality of said access conflict detection hardware modules (INSPO - INSPN), each associated with a segment (PMO - PMN) of said shared memory. [0008] The distributed computing system of claim 7 wherein each computing unit comprises a memory access initiator (PR) module and a hardware acquisition module (COLL) of said access conflict detection hardware modules, each said hardware acquisition module being configured to perform an exclusive and atomic acquisition of the hardware or modules of access conflict detection necessary for the detection of access conflicts for a transaction. [0009] A distributed computing system according to claim 8 wherein each said hardware acquisition module is configured to: - receive at least one message indicative of an access request, by said memory access initiator module, at a address of said shared memory; and identifying, from this message, the access conflict detection hardware module associated with said address; receiving and storing a message indicative of a set of hardware modules for detecting access conflicts in use; then determining, by means of said stored message, whether said identified detection hardware module is in use and: if it is not, updating said indicative message of a set of hardware detection hardware modules. access conflicts in use to indicate that said access conflict detection hardware module is now in use and transmit it, and transmit to said identified detection hardware module said message indicative of a request for access conflict. 'access; otherwise transmitting without changes said indicative message of a set of access conflict detection hardware modules in use; subsequently, again receiving said message indicative of a set of access conflict detection hardware modules in use, updating it to indicate that the associated access conflict detection hardware module (s); at or at the memory addresses implied by said transaction are no longer in use and transmit it; receiving, from at least one access conflict detection hardware module, at least one presence message or absence of access conflicts and, if at least one said message is indicative of the presence of a conflict of access; accessing, transmitting to said one or more hardware modules for detecting access conflicts a message for abandoning or canceling said or each said access request; receiving and storing another message indicative of a set of access conflict detection hardware modules in use; then receiving, from said memory access initiator module, a message indicative of the completion of a transaction involving one or more addresses of said shared memory, updating said message indicative of a set of hardware modules for conflict detection; in-use access stored to indicate that the access conflict detection hardware module (s) associated with the memory address (es) involved by said transaction are no longer in use and transmit it, and transmit to at least one access conflict detection hardware module at least one message indicative of a request to release a said shared memory address, which is no longer involved in a transaction. [0010] 10. distributed computing system according to claim 9 wherein said hardware acquisition modules (COLL) are interconnected by a communication network (RCOMM) implementing an exclusive non-blocking access technique. [0011] 11. distributed computing system according to one of claims 10 wherein said acquisition hardware modules are interconnected by a network having a ring-type logical topology (RA) and are configured to transmit step by step on said network a token carrying said message indicative of a set of hardware modules for detecting access conflicts in use. [0012] 12. distributed computing system according to one of claims 8 to 11 comprising a plurality of tiles (TO - TN), said shared memory and a communication network (RCOMM) connecting said tiles together and said shared memory and at least a said access conflict detection hardware module (INSP), each said tile comprising at least one said computing unit (UC). [0013] 13. A method of using a distributed computing system according to one of the preceding claims, comprising the following steps: a) using a computing unit (CPU) to transmit to at least one hardware conflict detection module; access (INSP) at least one message indicative of a request to access an address of a shared memory; b) using said or each said access conflict detection hardware module to determine, from a respective probabilistic data structure (FBS), indicative of a set of addresses of said shared memory (MP) involved in a transaction in progress, if said address is already involved in a current transaction, and for transmitting a presence message or absence of access conflicts addressed to said calculation unit; c) using said calculation unit to determine, from the presence message (s) or absence of access conflicts received from said or each said detection hardware module, whether a transaction involving said or each address of said shared memory can be performed or not, and to transmit to said or each said detection hardware module at least one message indicative of a reservation or release of at least one said address of said shared memory; and using said or each said detection module to update said probabilistic data structure so that the reserved addresses and the freed addresses are considered, respectively, as being / not being involved in a current transaction. [0014] The method of claim 13, wherein: said distributed computing system comprises a plurality of said computing units, a said shared memory and a plurality of said access conflict detection hardware modules (INSPO-INSPN) , each associated with at least one address range (PMO - PMN) of said shared memory; each said computing unit comprises a memory access initiator module (PR) and a hardware acquisition module (COLL) of the said hardware modules for detecting access conflicts; Wherein said step a) comprises the following operations: a1) using a hardware acquisition module to receive and store a message indicative of a set of access conflict detection hardware modules (INSPs) in use ; a2) using a memory access initiator module (PR) associated with said hardware acquisition module for transmitting to said acquisition hardware module at least one said message indicative of an access request to an address of said shared memory; a3) using said hardware acquisition module to identify the access conflict detection hardware module (s) associated with said or each said memory address; a4) determining, by means of said indicative message of a set of in-use access conflict detection hardware modules (J), whether the access conflict detection hardware module (s) associated with said at least one each said memory address of the transaction are in use and: if they are not, send them said message indicative of an access request, update said message indicative of a set of hardware modules detecting access conflicts in use to indicate that said or each said access conflict detection hardware module (INSP) is now in use and transmit it; otherwise transmitting it without modification, and wherein said step b) comprises the following operations: b1) using said acquisition hardware module to receive at least one access conflict detection hardware module at least one message of presence or absence of access conflicts and, if said or all of said messages are indicative of a lack of access conflicts, to transmit to said access conflict detection hardware module (s) confirmatory messages of said access requests ; b2) using said acquisition hardware module (COLL) to receive a message indicative that a transaction involving one or more addresses of said shared memory has been completed, updating said indicative message of a set of detection hardware modules; in-use access conflicts (J) to indicate that the acquisition hardware module (s) associated with the memory address (es) involved by said transaction are no longer in use and transmit it. [0015] 15. The method of claim 14 wherein said acquisition hardware modules are interconnected by a communication network (RA) implementing a non-blocking exclusive access technique. [0016] 16. The method of claim 15 wherein said acquisition hardware modules are interconnected by a network having a ring-type logical topology, said message indicative of a set of 25 hardware modules for detecting access conflicts in progress. of use (J) being transmitted step by step on said network.
类似技术:
公开号 | 公开日 | 专利标题 EP3129874B1|2018-06-13|Distributing computing system implementing a non-speculative hardware transactional memory and a method for using same for distributed computing EP2839378B1|2017-10-04|System and method for managing cache coherence in a network of processors provided with cache memories EP0109898B1|1987-09-30|Queued temporary data storage unit EP0434483A1|1991-06-26|Processor with plural microprogrammed execution units FR2720531A1|1995-12-01|Shared memory access control lock for multiprocessor system FR2553207A1|1985-04-12|CONTROL SYSTEM FOR NARROW COUPLING MULTI-PROCESSING UNITS FR2953307A1|2011-06-03|MEMORY DIRECT ACCESS CONTROLLER FOR DIRECT TRANSFER OF DATA BETWEEN MEMORIES OF MULTIPLE PERIPHERAL DEVICES EP0166062B1|1989-10-25|Arbitration device for access to a shared resource US9553951B1|2017-01-24|Semaphores in distributed computing environments WO2011067507A1|2011-06-09|System enabling direct data transfers between memories of a plurality of elements of said system EP0435718A1|1991-07-03|Processor with plural microprogrammed units and with a mechanism for anticipated instruction execution EP2666092B1|2015-02-25|Multi-core system and data coherency method EP2726985B1|2021-10-06|Device and method for synchronizing tasks executed in parallel on a platform comprising several calculation units FR3101987A1|2021-04-16|Electronic system level reproducible parallel simulation method implemented by means of a multi-core discrete event simulation computer system WO2012038000A1|2012-03-29|Method for managing tasks in a microprocessor or in a microprocessor assembly US8407420B2|2013-03-26|System, apparatus and method utilizing early access to shared cache pipeline for latency reduction US7836033B1|2010-11-16|Method and apparatus for parallel updates to global state in a multi-processor system FR3107130A1|2021-08-13|Method for managing sampled data shared between several processing units US20200356420A1|2020-11-12|Executing an atomic primitive in a multi-core processor system WO2020016511A1|2020-01-23|Method for accelerating the execution of a single-path program by the parallel execution of conditionally concurrent sequences FR2805372A1|2001-08-24|Write pipelining with global ordering in mulipath multiprocessor systems with a single input-output device, uses bus adapters controlling access to an operation queue which is used to control operation sequence US8639887B2|2014-01-28|Dynamically altering a pipeline controller mode based on resource availability EP1493083A2|2005-01-05|Reconfigurable control system based on hardware implementation of petri graphs FR2745100A1|1997-08-22|FAULT TRANSPARENCY COMPUTER SYSTEM FOR USER APPLICATIONS
同族专利:
公开号 | 公开日 EP3129874B1|2018-06-13| FR3019921B1|2017-08-11| EP3129874A1|2017-02-15| US10416925B2|2019-09-17| US20170017435A1|2017-01-19| WO2015155294A1|2015-10-15|
引用文献:
公开号 | 申请日 | 公开日 | 申请人 | 专利标题 US20090133032A1|2007-11-21|2009-05-21|Stuart David Biles|Contention management for a hardware transactional memory| US20090183159A1|2008-01-11|2009-07-16|Michael Maged M|Managing concurrent transactions using bloom filters| WO2013147898A1|2012-03-30|2013-10-03|Intel Corporation|Tracing mechanism for recording shared memory interleavings on multi-core processors|WO2018130800A1|2017-01-13|2018-07-19|Arm Limited|Partitioning of memory system resources or performance monitoring| US10268379B2|2017-01-13|2019-04-23|Arm Limited|Partitioning of memory system resources or performance monitoring| US10649678B2|2017-01-13|2020-05-12|Arm Limited|Partitioning of memory system resources or performance monitoring| US10664306B2|2017-01-13|2020-05-26|Arm Limited|Memory partitioning| US10664400B2|2017-07-11|2020-05-26|Arm Limited|Address translation cache partitioning| US11243892B2|2017-01-13|2022-02-08|Arm Ltd.|Partitioning TLB or cache allocation| US11256625B2|2019-09-10|2022-02-22|Arm Limited|Partition identifiers for page table walk memory transactions|JP2500101B2|1992-12-18|1996-05-29|インターナショナル・ビジネス・マシーンズ・コーポレイション|How to update the value of a shared variable| US8239633B2|2007-07-11|2012-08-07|Wisconsin Alumni Research Foundation|Non-broadcast signature-based transactional memory| US20100332768A1|2009-06-26|2010-12-30|Microsoft Corporation|Flexible read- and write-monitored and buffered memory blocks| US9411733B2|2011-09-09|2016-08-09|University Of Rochester|Sharing pattern-based directory coherence for multicore scalability | US9292294B2|2012-09-27|2016-03-22|Intel Corporation|Detection of memory address aliasing and violations of data dependency relationships|US10235297B2|2015-11-04|2019-03-19|International Business Machines Corporation|Mechanism for creating friendly transactions with credentials| US10270773B2|2015-11-04|2019-04-23|International Business Machines Corporation|Mechanism for creating friendly transactions with credentials| US10722990B2|2016-09-15|2020-07-28|General Electric Company|Method for installing and removing modularized silencer baffles| US10891307B2|2018-05-31|2021-01-12|Microsoft Technology Licensing, Llc|Distributed data synchronization in a distributed computing system| CN109614220B|2018-10-26|2020-06-30|阿里巴巴集团控股有限公司|Multi-core system processor and data updating method|
法律状态:
2015-04-30| PLFP| Fee payment|Year of fee payment: 2 | 2016-02-26| TP| Transmission of property|Owner name: COMMISSARIAT A L'ENERGIE ATOMIQUE ET AUX ENERG, FR Effective date: 20160121 | 2016-04-28| PLFP| Fee payment|Year of fee payment: 3 | 2017-04-28| PLFP| Fee payment|Year of fee payment: 4 | 2018-04-26| PLFP| Fee payment|Year of fee payment: 5 | 2019-04-29| PLFP| Fee payment|Year of fee payment: 6 | 2021-01-15| ST| Notification of lapse|Effective date: 20201209 |
优先权:
[返回顶部]
申请号 | 申请日 | 专利标题 FR1453216A|FR3019921B1|2014-04-10|2014-04-10|DISTRIBUTED CALCULATION SYSTEM IMPLEMENTING NON-SPECULATIVE MATERIAL TRANSACTIONAL MEMORY AND METHOD OF USE FOR DISTRIBUTED CALCULATION|FR1453216A| FR3019921B1|2014-04-10|2014-04-10|DISTRIBUTED CALCULATION SYSTEM IMPLEMENTING NON-SPECULATIVE MATERIAL TRANSACTIONAL MEMORY AND METHOD OF USE FOR DISTRIBUTED CALCULATION| PCT/EP2015/057733| WO2015155294A1|2014-04-10|2015-04-09|Distributing computing system implementing a non-speculative hardware transactional memory and a method for using same for distributed computing| US15/118,028| US10416925B2|2014-04-10|2015-04-09|Distributing computing system implementing a non-speculative hardware transactional memory and a method for using same for distributed computing| EP15716477.3A| EP3129874B1|2014-04-10|2015-04-09|Distributing computing system implementing a non-speculative hardware transactional memory and a method for using same for distributed computing| 相关专利
Sulfonates, polymers, resist compositions and patterning process
Washing machine
Washing machine
Device for fixture finishing and tension adjusting of membrane
Structure for Equipping Band in a Plane Cathode Ray Tube
Process for preparation of 7 alpha-carboxyl 9, 11-epoxy steroids and intermediates useful therein an
国家/地区
|