Open Access BASE2021

Simulation Testbed for Evaluating Distributed Querying and Searching of Mass Spectrometry Big Data in a Network-based Infrastructure

Abstract

Advanced access and reuse mechanisms for large-scale Mass Spectrometry (MS) data are essential for democratizing data for the omics research community and for ensuring that the data adhere to the FAIR (Findable, Accessible, Interoperable, Reusable) principles. Although a number of centralized data repositories have been established, their search mechanisms are limited to the metadata associated with the MS datasets. Furthermore, they require a constant influx of resources for maintenance. In this paper, we propose a novel distributed infrastructure as an alternative that supports direct MS/MS spectral search. We designed and developed a simulation testbed using concepts from computer networks, queuing theory, and stochastic simulation methods. Results show that a distributed MS search based on raw MS/MS spectra can scale gracefully to up to 2000 participating nodes, with the proposed networked infrastructure processing simultaneous queries on the order of milliseconds to a few seconds over a total of up to fifty billion MS/MS spectra.
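To illustrate the kind of queuing-theory and stochastic-simulation reasoning the abstract refers to, the following is a minimal sketch, not the testbed described in the paper. It assumes a single participating node can be modeled as an M/M/1 queue (Poisson query arrivals, exponentially distributed service times, one server, first-come first-served); that model, the function names, and the parameter values are illustrative assumptions and do not come from the source. The simulated mean query latency is compared against the standard closed-form M/M/1 result 1 / (service_rate - arrival_rate).

```python
import random


def simulate_node_latency(arrival_rate, service_rate, num_queries, seed=0):
    """Estimate the mean time a query spends at one node (waiting + service).

    Assumption (not from the paper): the node behaves as an M/M/1 FIFO queue.
    """
    rng = random.Random(seed)
    arrival_time = 0.0      # arrival time of the current query
    prev_departure = 0.0    # departure time of the previous query
    total_latency = 0.0

    for _ in range(num_queries):
        # Poisson arrivals: exponential inter-arrival times.
        arrival_time += rng.expovariate(arrival_rate)
        # Service starts once the query has arrived and the server is free.
        start_time = max(arrival_time, prev_departure)
        prev_departure = start_time + rng.expovariate(service_rate)
        total_latency += prev_departure - arrival_time

    return total_latency / num_queries


def mm1_expected_latency(arrival_rate, service_rate):
    """Closed-form mean time in system for a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival_rate must be < service_rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical rates, in queries per unit of time.
    simulated = simulate_node_latency(0.5, 1.0, num_queries=200_000, seed=1)
    expected = mm1_expected_latency(0.5, 1.0)
    print(f"simulated mean latency: {simulated:.3f}")
    print(f"analytical mean latency: {expected:.3f}")
```

Checking a stochastic simulation against a known analytical result in this way is a common sanity test before extending a model to many nodes and network delays, where closed-form answers are usually no longer available.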
