Data management for Neurophysiology at the German INCF Node

Andrey Sobolev (Department Biology II, Ludwig-Maximilians-Universität München, Germany), Philipp Rautenberg (Department Biology II, Ludwig-Maximilians-Universität München, Germany), Jan Benda (Department Biology II, Ludwig-Maximilians-Universität München, Germany), Jan Grewe (Department Biology II, Ludwig-Maximilians-Universität München, Germany), Andreas Herz (Department Biology II, Ludwig-Maximilians-Universität München, Germany), Thomas Wachtler (Department Biology II, Ludwig-Maximilians-Universität München, Germany)

Progress in neuroscience methodology and research is leading to a rapidly growing number of studies and is generating enormous quantities of heterogeneous and complex data from many species, modalities and levels of study, at ever finer levels of granularity. A key element in exploiting the full potential of this large and highly diverse body of data is the integration of brain research with information technology, so that data and knowledge can be used efficiently together with analysis and modeling. To make data analysis, re-analysis, and data sharing efficient, data together with metadata should be managed and accessed in a unified and reproducible way, so that the researcher can concentrate on the scientific questions rather than on problems of data management.

At the German INCF Node (G-Node, www.g-node.org) we support a growing set of activities aimed at developing and applying tools that support efficient research in electrophysiology. In particular, we provide a data management platform where neuroscientists can upload and organize their data for long-term storage, collaboration and analysis. Within the G-Node infrastructure we integrate modern approaches to data and metadata management. For metadata, we use a new format, the “Open metaData Markup Language” (odML, http://www.gnode.org/projects/odml). This format specifies a hierarchical structure for storing arbitrary meta information as extended key-value pairs, so-called properties, which can be logically grouped into sections and subsections. odML defines the format, not the content, so that it is inherently extensible and can be adapted flexibly to the specific requirements of any laboratory. In line with recent proposals by the community, we provide a flexible data model. To store raw physiological recordings, we adopt the NEO (http://packages.python.org/neo/) objects, which provide common names and concepts for dealing with electrophysiological data in a well-structured way, as well as a number of interfaces for extracting data from different proprietary formats. We present a set of applications that allow scientists to import, export and exchange data. We demonstrate how managing data together with metadata at G-Node achieves an organization that reflects the structure of a particular scientific approach and defines data units for analysis and sharing.
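
The hierarchical metadata structure described above (properties as extended key-value pairs, grouped into sections and subsections) can be illustrated with a minimal Python sketch. This is not the odML library or its API; the class names `Property` and `Section`, the `find` helper, and the example content (`Recording`, `sampling_rate`, `Stimulus`, `contrast`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Property:
    """An extended key-value pair: a name, a value, and optional extras."""
    name: str
    value: Any
    unit: Optional[str] = None
    definition: Optional[str] = None


@dataclass
class Section:
    """A named group of properties that may contain nested subsections."""
    name: str
    properties: Dict[str, Property] = field(default_factory=dict)
    subsections: Dict[str, "Section"] = field(default_factory=dict)

    def add_property(self, prop: Property) -> Property:
        self.properties[prop.name] = prop
        return prop

    def add_section(self, section: "Section") -> "Section":
        self.subsections[section.name] = section
        return section

    def find(self, path: str) -> Property:
        """Look up a property by a slash-separated path such as 'Stimulus/contrast'."""
        *section_names, prop_name = path.split("/")
        node = self
        for name in section_names:
            node = node.subsections[name]
        return node.properties[prop_name]


# Hypothetical example content; the format itself does not prescribe these names.
recording = Section("Recording")
recording.add_property(Property("sampling_rate", 20000, unit="Hz"))
stimulus = recording.add_section(Section("Stimulus"))
stimulus.add_property(Property("contrast", 0.2))

rate = recording.find("sampling_rate")
contrast = recording.find("Stimulus/contrast")
print(rate.name, rate.value, rate.unit)
print(contrast.name, contrast.value)
```

Because the sketch fixes only the structure (sections containing properties and further sections) and not the names or values stored in it, a laboratory could add its own sections without changing the classes, which mirrors the separation of format and content described above.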

Because scientists use many different technologies in their research, data must be accessible in different ways so that data management can be embedded easily into the individual research environment. To meet this requirement, we present, in addition to the standard web-based framework, a common application interface for data access that implements all of the functionality described above (data upload, data organization, etc.). Using a collection of libraries developed and provided by the G-Node in commonly used programming languages (e.g. Python), researchers can perform computations directly from their analysis programs. Direct access to the platform gives researchers the freedom to select the technology best suited to their current research.

Such an approach enables interoperation with other neuroscientific data hubs and INCF nodes, thus facilitating collaboration, research and scientific progress within the neuroscientific community.

Preferred presentation format: Demo
Topic: General neuroinformatics
