Presented by: Chris Siegler, Merck and Co., Inc.
---------------------------------------------------------------------------------------------------------------------
Enabling Automated End-to-End Chromatographic Data Workflows and Accelerated Data Insights with the Allotrope Simple Model (ASM) Vendor-Neutral Data Format
Today's life sciences R&D laboratory generates vast amounts of diverse data acquired with heterogeneous hardware and software solutions supplied by various vendors. The drive toward lab digitization and cloud adoption has made it possible to store these data relatively easily and inexpensively. However, the heterogeneity of vendor data formats and the need to aggregate data from multiple source locations (instrument PCs, file shares, databases, etc.) make it challenging to build data pipelines that move the data to the cloud. The inconsistent quality of the data (in both context and structure) makes it difficult for scientists to find, access, connect, and consume the data throughout its lifecycle, which limits its value. As a result, many scientists today spend significant amounts of time manually transferring and curating data, both for routine daily use and as input into advanced modeling and prediction workflows. To address this important problem, which extends the timeline for reaching critical data insights in the life sciences, a vendor-agnostic standard data format called the Allotrope Simple Model (ASM) was applied to an end-to-end laboratory workflow centered on chromatographic data analysis as an example. ASM structures chromatographic data from any vendor's software in a standardized JSON format and leverages standardized ontologies to provide consistent context. This presentation will demonstrate how ASM standardization of chromatographic data simplifies and improves the scalability of operational activities (data movement, transformation, storage, and integration) and creates transformative opportunities for more on-demand data consumption (visualizations, dashboards, secondary data processing, and feeding AI/ML pipelines).
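To make the idea of a standardized JSON structure concrete, the sketch below shows how a downstream consumer might flatten ASM-style peak data into rows for a dashboard, database load, or ML feature table. The nested key names approximate the general shape of ASM chromatography documents (nested "aggregate document" levels with value/unit pairs) but are illustrative assumptions, not the authoritative schema or anything taken from the presentation.

```python
import json

# Illustrative ASM-style liquid chromatography document. Key names are an
# approximation of the ASM schema's shape for demonstration purposes only.
asm_doc = {
    "liquid chromatography aggregate document": {
        "liquid chromatography document": [
            {
                "sample document": {"sample identifier": "LOT-001"},
                "peak list": {
                    "peak": [
                        {
                            "identifier": "peak-1",
                            "retention time": {"value": 4.21, "unit": "min"},
                            "peak area": {"value": 15234.7, "unit": "mAU.s"},
                        },
                        {
                            "identifier": "peak-2",
                            "retention time": {"value": 6.88, "unit": "min"},
                            "peak area": {"value": 3021.4, "unit": "mAU.s"},
                        },
                    ]
                },
            }
        ]
    }
}


def extract_peaks(doc):
    """Flatten an ASM-style chromatography document into simple rows."""
    rows = []
    agg = doc["liquid chromatography aggregate document"]
    for injection in agg["liquid chromatography document"]:
        sample_id = injection["sample document"]["sample identifier"]
        for peak in injection["peak list"]["peak"]:
            rows.append(
                {
                    "sample_id": sample_id,
                    "peak_id": peak["identifier"],
                    "retention_time_min": peak["retention time"]["value"],
                    "area": peak["peak area"]["value"],
                }
            )
    return rows


if __name__ == "__main__":
    # Because data from any vendor's software is converted to the same JSON
    # shape, a single extraction routine like this can serve every instrument.
    print(json.dumps(extract_peaks(asm_doc), indent=2))
```

The point of the sketch is the design choice: once every chromatography data system emits the same vendor-neutral structure, the transformation, storage, and consumption layers no longer need vendor-specific parsers.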