This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. As such, it is in the public domain, and under the provisions of Title 17, United States Code, Section 105, may not be copyrighted. ; IEEE Transactions on Computers, C-39, October 1990, pp. 1298-1304 ; A multiprocessing system is t/s diagnosable if all faulty processors can be identified to within s processors provided there are no more than t faulty processors. A characterization theorem of Karunanithi and Friedman for t/s diagnosability in certain special cases of systems called designs is extended to the entire class of D designs. We show that for large.
Every modern CPU uses a complex memory hierarchy, which consists of multiple cache memory levels. It is very difficult to predict the behavior of this hierarchy for a given program (for details see [1, 2]). The situation is even worse for systems with shared memory. The most important example is the case of SMP (symmetric multiprocessing) systems [3]. The importance of these systems is growing because the newest CPUs are multi-core. The Cache Emulator (CE) can simulate the behavior of caches inside an SMP system and compute the number of cache misses during a computation. All measurements are done in the "off-line" mode on a single CPU. The CE uses its own emulated cache memory for an exact simulation. This means that no other CPU activity influences the behavior of the CE. This work extends the Cache Analyzer introduced in [4].
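The idea of counting cache misses off-line on an emulated cache can be illustrated with a minimal sketch. It is not the Cache Emulator described in the abstract: it models a single set-associative cache with LRU replacement and no SMP coherence, and the class name `SetAssociativeCache`, the helper `count_misses`, and all parameter values are assumptions chosen for illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Hypothetical single-level cache model with LRU replacement."""

    def __init__(self, num_sets, ways, line_size):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        # One ordered mapping per set; insertion order tracks recency.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.accesses = 0
        self.misses = 0

    def access(self, address):
        """Simulate one memory access; return True on a hit."""
        line = address // self.line_size
        index = line % self.num_sets
        tag = line // self.num_sets
        cache_set = self.sets[index]
        self.accesses += 1
        if tag in cache_set:
            cache_set.move_to_end(tag)  # mark as most recently used
            return True
        self.misses += 1
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)  # evict least recently used
        cache_set[tag] = None
        return False


def count_misses(cache, addresses):
    """Replay an address trace and return the misses it caused."""
    before = cache.misses
    for address in addresses:
        cache.access(address)
    return cache.misses - before


if __name__ == "__main__":
    cache = SetAssociativeCache(num_sets=4, ways=2, line_size=64)
    # Sequential scan of 1024 bytes with an 8-byte stride.
    trace = range(0, 1024, 8)
    print("misses:", count_misses(cache, trace), "of", cache.accesses)
```

Because the trace is replayed against the model's own state, the miss count depends only on the address sequence and the assumed cache geometry, which is the property the abstract attributes to off-line measurement.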
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. As such, it is in the public domain, and under the provisions of Title 17, United States Code, Section 105, may not be copyrighted. ; Proceedings of the 26th Annual Allerton Conference on Communication, Control, and Computing, Sept. 1988, regular (full) paper, pp. 408-416 (Unrefereed) ; We propose a distributed disabling algorithm for a multiprocessing system in which each processor or unit is prevented from doing computation when it fails some number of tests by other units. The goal is to disable all faulty units and to enable all fault-free units. Specifically, a unit is disabled if it fails d or more tests by enabled units (d-disabling rule). A multiprocessor system is c-correctable using the d-disabling rule if all faulty units are permanently disabled and all fault-free units are permanently enabled after a finite number of applications of the disabling rule, provided there are no more than c faulty units. This models an unattended system where the removal of faulty units is done locally by simple and reliable circuitry. We give a sufficient condition for c-correctability in general systems and a necessary and sufficient condition in general systems where c < d. Then, we give necessary and sufficient conditions for c-correctability of two types of systems, (1) complete digraphs and (2) a new class of systems called segmented systems.
Our society is generating an increasing amount of data at an unprecedented scale, variety, and speed. This also applies to numerous research areas, such as genomics, high energy physics, and astronomy, for which large-scale data processing has become crucial. However, there is still a gap between the traditional scientific computing ecosystem and big data analytics tools and frameworks. On the one hand, high performance computing (HPC) programming models lack productivity, and do not provide means for processing large amounts of data in a simple manner. On the other hand, existing big data processing tools have performance issues in HPC environments, and are not general-purpose. In this paper, we propose and evaluate PyCOMPSs, a task-based programming model for Python, as an excellent solution for distributed big data processing in HPC infrastructures. Among other useful features, PyCOMPSs offers a highly productive general-purpose programming model, is infrastructure-agnostic, and provides transparent data management with support for distributed storage systems. We show how two machine learning algorithms (Cascade SVM and K-means) can be developed with PyCOMPSs, and evaluate PyCOMPSs' productivity based on these algorithms. Additionally, we evaluate PyCOMPSs performance on an HPC cluster using up to 1,536 cores and 320 million input vectors. Our results show that PyCOMPSs achieves similar performance and scalability to MPI in HPC infrastructures, while providing a much more productive interface that allows the easy development of data analytics algorithms. ; This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement H2020-MSCA-COFUND2016-754433. This work has been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), by Generalitat de Catalunya, Spain (contract 2014-SGR-1051). 
The research leading to these results has also received funding from the collaboration between Fujitsu and BSC (Script Language Platform). ; Peer Reviewed ; Postprint (published version)
Rediscovery of buried ideas from the pioneer age of computers -- Structured design for structured architecture -- Computer architectures for the interpretation of high-level languages -- Some aspects of the STARLET project -- A concept for hardwired main storage management -- A virtual memory organization based on a multi-activity drum -- Content addressing in data bases by special peripheral hardware: a proposal called "Suchrechner" -- Multiprocessors and other parallel systems — an introduction and overview -- STARAN: An associative approach to multiprocessor architecture -- Design of a hierarchical multiprocessor system for multilevel parallel computation -- The connection of an associative pipeline with a cache memory -- A processor system for multiprocessing -- A general purpose array with a broad spectrum of applications -- Magnetic bubbles as a computer technology -- On the problem of fast random and sequential data access in shift register memories -- On pipeline realisations of dynamic memories -- A fast access algorithm for cellular dynamic memories.