KIMBLE

Introduction

KIMBLE (KNIME-based Integrated MetaBoLomics Environment) is a workflow in the KNIME Analytics Platform for NMR-based metabolomics. It runs in a Linux virtual machine that can easily be added to Oracle VirtualBox. An article on KIMBLE has been published in Analytica Chimica Acta.

Features

  • KIMBLE is Free Open-Source Software (FOSS)
  • Import and processing of time-domain 1D and 2D-JRES data
  • Adaptive binning of spectral data
  • Metabolite quantification
  • Results from individual workflow steps can be easily inspected
  • The standard workflow can be changed or extended, also by people with no programming knowledge
  • New workflow branches can be grafted on any point of the old workflow
  • KNIME has a large repository of data processing, analysis and machine learning nodes
  • The Virtual Machine can run on Windows, Mac, and Linux PCs
  • Workflows can be easily shared or archived
  • KNIME allows Python and R scripts to be inserted, created and edited in the workflow
  • Combining the workflow, the data, and the software in a single virtual machine makes the data analysis fully reproducible
  • Various useful visualization tools

Contact

We would love to know what you think of KIMBLE. Please send an email with your thoughts to A.Verhoeven@lumc.nl, or follow us on Twitter: @KIMBLEmbx

System requirements

  • 8 GB RAM
  • 32 GB free disk space (SSD recommended)
  • Relatively recent 64-bit host OS (MS Windows / Intel-based macOS / Linux)

License

The KIMBLE workflow is released under the 2-clause BSD license. The KNIME Analytics Platform, Xubuntu, and all other tools, libraries, software and artwork contained in the KIMBLEVB virtual machine are distributed with their own specific licenses. Only download the KIMBLEVB virtual machine if you agree with these licenses.

Installation

The KIMBLE image is identical for Windows, Mac and Linux. However, Mac and Linux do not natively support the zip64 format that Windows uses for compressing large files. Therefore KIMBLE for Mac and Linux is distributed as a .tar.gz archive.

Installation instructions

  • Download Oracle VirtualBox
  • Install Oracle VirtualBox
  • Download the KIMBLEVB virtual machine image from here (be patient: it’s big)
  • Extract the folder in the downloaded archive (.zip on Windows, .tar.gz on Mac and Linux) to a convenient place with fast disk access
  • Click on the “Machine” menu in the VirtualBox Manager window, then click “Add”
  • Select the “KIMBLEVB_xxxxxx.vbox” file in the extracted folder.
  • The KIMBLE virtual machine can now be launched from the VirtualBox manager window
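
For readers who prefer to script the extraction step, the following is a minimal Python sketch, not part of KIMBLE itself. It unpacks either archive type and lists the .vbox files it finds. The function name extract_kimble_archive and both path arguments are hypothetical; only the .zip/.tar.gz distinction and the .vbox file come from the instructions above.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_kimble_archive(archive: Path, destination: Path) -> list[Path]:
    """Unpack a .zip or .tar.gz archive and return the .vbox files found."""
    destination.mkdir(parents=True, exist_ok=True)
    name = archive.name.lower()
    if name.endswith(".zip"):
        # Python's zipfile module reads zip64 archives transparently.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(destination)
    elif name.endswith((".tar.gz", ".tgz")):
        with tarfile.open(archive, "r:gz") as tf:
            # Use the safer "data" extraction filter where this Python has it.
            kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
            tf.extractall(destination, **kwargs)
    else:
        raise ValueError(f"Unsupported archive type: {archive.name}")
    # The .vbox file is the one to select in the VirtualBox "Add" dialog.
    return sorted(destination.rglob("*.vbox"))
```

Called with the path of the downloaded archive and a target folder on a fast disk, the function returns the paths of any .vbox files, which is the file type selected in the “Add” step.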

How to start and quit

  • To start KIMBLE, double-click the KIMBLEVB virtual machine on the left of the VirtualBox manager window
  • Double-click the KIMBLE icon in the new window that opens
  • Double-click KIMBLE in the KNIME Explorer mini window
  • To close all of KIMBLE quickly, close the KIMBLEVB window (unsaved work will be lost)
  • When the “Close Virtual Machine” window appears, select “Send the shutdown signal” (DO NOT select the “power off the machine” option)
  • When the “Log out kimbler” window appears, select “Shut Down”
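
The same start and shutdown steps can, in principle, be issued from the host's command line with VirtualBox's VBoxManage tool. The Python sketch below only builds the two commands and does not run them. It assumes VBoxManage is on the PATH; the helper names and the vm_name argument are hypothetical (use the name shown in your VirtualBox manager window). The acpipowerbutton request is assumed here to correspond to the “Send the shutdown signal” option, so the guest may still show its own log-out dialog.

```python
def start_command(vm_name: str) -> list[str]:
    """Command that boots the named VM with its normal GUI window."""
    return ["VBoxManage", "startvm", vm_name, "--type", "gui"]


def shutdown_command(vm_name: str) -> list[str]:
    """Command that sends the ACPI power-button event, i.e. a clean
    shutdown request, rather than cutting the power."""
    return ["VBoxManage", "controlvm", vm_name, "acpipowerbutton"]
```

Passing either list to subprocess.run on the host would execute it. As with the graphical route, unsaved work in KNIME may be lost on shutdown.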

Older versions

KIMBLEVB_190208

KIMBLEVB_190125

KIMBLEVB_181115

KIMBLEVB_180918

KIMBLEVB_180725

KIMBLEVB_180413

KIMBLEVB_180412

Description and motivation

Before the development of KIMBLE, I did most of the NMR metabolomics data processing with self-written scripts in MATLAB. This confronted me with several problems. One was that MATLAB is an expensive closed-source platform. Another was that many scripts were initially conceived as throw-away code, yet some of them eventually produced important data. To reproduce the final results of specific projects, the outputs of some of these scripts had to be used as inputs of other scripts. Normally, making this work reliably requires applying modern software engineering and version management principles. However, there is a huge gap between how code for a software application and how code for a data processing workflow are developed. The former is usually developed according to a well-conceived and documented plan, while the latter is written in an ad hoc fashion, with new steps conceived and implemented depending on the results of the previous steps. The result is a heap of code that is hard to understand, even for its author.

There are several initiatives to make data processing for scientific workflows more reproducible. One approach is to use scientific notebooks, where data processing code is combined with commentary and graphics. Jupyter/IPython and R Markdown/knitr are the best-known examples of these. Another approach is scientific workflow managers, such as Taverna and Galaxy, where the workflow is represented as a directed graph. The KNIME Analytics Platform, created and maintained by KNIME AG in Zurich, Switzerland, is an example of the latter approach. This platform was chosen for the development of KIMBLE because the KNIME Analytics Platform is free open-source software, comes with a large repository of data analysis tools (“nodes”), and can be installed locally. Custom nodes can be programmed in Python or R. Furthermore, by installing KNIME in a virtual machine, KNIME, domain-specific Python/R libraries, and scientific data can be kept together in a single entity, the VM file. Thus the workflow can be easily shared with others, archived, and, if necessary, re-evaluated later.

References

If you use KIMBLE in one of your publications, please cite:  KIMBLE: a versatile visual NMR metabolomics workbench in KNIME; A. Verhoeven, M. Giera, O. Mayboroda; Anal. Chim. Acta, accepted.

Please also consider citing the other articles that were used in the construction of the KIMBLE workflow. An up-to-date list is given in the KNIME workbench itself. New references will be added as the workflow is extended. If you think that KIMBLE would benefit from the addition of a specific tool, please contact the corresponding author. If we decide to integrate the tool into KIMBLE, the corresponding article will be added to the reference list.

  • J. J. Helmus, C. P. Jaroniec, J. Biomol. NMR 2013, 55, 355–367.
  • F. Dieterle, A. Ross, G. Schlotterbeck, H. Senn, Anal. Chem. 2006, 78, 4281–4290.
  • T. De Meyer, D. Sinnaeve, B. Van Gasse et al., Anal. Chem. 2008, 80, 3783–3790.
  • E. Holmes, O. Cloarec, J. K. Nicholson, J. Proteome Res. 2006, 5, 1313–1320. Implementation by E. Nevedomskaya.
  • E. L. Ulrich, H. Akutsu, J. F. Doreleijers et al., Nucleic Acids Res. 2007, 36, D402–D408.
  • D. S. Wishart, Y. D. Feunang, A. Marcu, A. C. Guo, K. Liang et al., Nucleic Acids Res. 2018, 46, D608–D617.
  • S. Bouatra, F. Aziat, R. Mandal et al., PLoS One 2013, 8, e73076.
  • J. Hastings, G. Owen, A. Dekker et al., Nucleic Acids Res. 2016, 44, D1214–D1219.

FAQ

  • My virtual machine refuses to start! The IT departments of many organizations disable virtualization in the computer’s BIOS. Enable it in the BIOS menu, or ask the IT department to do it for you.
  • How do I get new NMR data into KIMBLE? On Windows PCs, the contents of the C: drive can be accessed by double-clicking the “Host” icon in the KIMBLEVB window. To make drives other than C: accessible, or to make your Mac’s drive accessible, you need to edit the shared folder settings in VirtualBox. Make sure the KIMBLEVB virtual machine is not running. Right-click on the KIMBLEVB virtual machine in the Oracle VM VirtualBox Manager window. Click “settings”. Click “shared folders”. This allows you to edit the list of folders/drives on your host computer that are accessible from within KIMBLEVB. In KIMBLEVB, these folders can be accessed by clicking “File system” and then clicking on the folder “media”. The shared folders have names starting with “sf_”.
  • What is the password of the standard user account? The username is “kimbler”; the default password is also “kimbler”.
  • How do I change the password for the user account? Click on the little white mouse logo in the upper left corner of the VM window. Click “Settings”. Scroll down to “Users and groups” and click. Click “Change…” on the right of “Password”. You can now change the password.
  • How can I connect to the internet from within KIMBLEVB? Before you do this, it is essential that you change the default password; see above.  To connect to the internet, right-click on the KIMBLEVB virtual machine in the Oracle VM VirtualBox Manager window. Click “settings”. Click “Network”. Click “Advanced”. Check the “Cable connected” box. Click “OK”.
  • My computer is much more powerful than the minimum system requirements. How can I give the virtual machine more resources? In the VirtualBox manager window, select the KIMBLEVB virtual machine. Double-click the “System” box under “Details”. Here you can give the VM more memory and/or CPU cores. Be sure to leave some memory for the host OS. If you increase the VM memory, you also have to increase the Java heap space to make the extra memory available to KNIME. To do this, start the KIMBLEVB virtual machine. In the VM, open a file manager, go to /home/kimbler/bin/knime and open the file knime.ini in Leafpad or another text editor. Find the line -Xmx4096m; this means that the current Java heap space is 4 gigabytes (4096 megabytes). You can change this according to the VM memory you just set in the VirtualBox manager. Just be sure to leave some memory for the guest OS; for example, if you give the VM 12 gigabytes of memory in the VirtualBox manager, you can set the Java heap space to -Xmx11264m (11 gigabytes, leaving 1 gigabyte for the guest OS).
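
The heap space arithmetic from the last answer can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the default reserve of 1 gigabyte for the guest OS is an assumption inferred from the 12 gigabyte example above, not a documented KIMBLE setting.

```python
import re


def heap_setting(vm_memory_gb: int, reserve_gb: int = 1) -> str:
    """Return the -Xmx line for a VM with vm_memory_gb gigabytes of memory,
    leaving reserve_gb gigabytes (assumed default: 1) for the guest OS."""
    if vm_memory_gb <= reserve_gb:
        raise ValueError("VM memory must exceed the reserve for the guest OS")
    return f"-Xmx{(vm_memory_gb - reserve_gb) * 1024}m"


def update_knime_ini(text: str, vm_memory_gb: int, reserve_gb: int = 1) -> str:
    """Replace the single -Xmx line in the contents of knime.ini."""
    new_line = heap_setting(vm_memory_gb, reserve_gb)
    updated, count = re.subn(
        r"^-Xmx\d+[kKmMgG]?$", new_line, text, flags=re.MULTILINE
    )
    if count != 1:
        raise ValueError(f"expected exactly one -Xmx line, found {count}")
    return updated
```

For a VM with 12 gigabytes, heap_setting(12) gives -Xmx11264m, matching the example in the answer above. update_knime_ini works on the text of knime.ini, so the file still has to be read and written back separately, ideally after making a backup copy.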