
paraprobe-toolbox

The paraprobe-toolbox is a collection of open-source tools for the efficient analysis of point cloud data in which each point can represent an atom or a molecular ion. A key application of the toolbox has been research in the field of Atom Probe Tomography (APT) and the related Field Ion Microscopy (FIM). The toolbox does not replace but complements existing software tools in this research field. Because it handles points as objects with properties and supports analyses of the spatial arrangement of, and intersections between, geometric primitives, the software can equally be used for analyzing data in materials science and engineering.

Capabilities

Each tool of the paraprobe-toolbox is specialized for specific tasks. Examples are loading point cloud data from the formats of commercial atom probe software, computing triangulated surface meshes using convex hulls, alpha shapes, or alpha wrappings, building tessellations, computing spatial statistics, computing iso-surfaces and subsequently identifying microstructural features, and correlating geometric primitives and objects via graph-based analyses.

Efficient and FAIR-embracing

The paraprobe-toolbox is designed for users who want to take advantage of parallelization. To this end, the project has demonstrated how open-source software developments from computer science and computational geometry can be made accessible to the atom probe research community, giving users access to transparent, fast, and robust algorithms. Originally developed in C/C++, the project has recently embraced Python to offer scientists an easier way to use the tools in combination with their own Python-based workflows. The paraprobe-toolbox uses state-of-the-art libraries such as the Computational Geometry Algorithms Library (CGAL). Supporting users with FAIR research and implementing FAIR data stewardship principles is another recent design priority of the paraprobe-toolbox.

NeXus data schemas

With NeXus as the data format and description tool, a large set of data schemas has been formulated. These schemas pioneer how common computational workflows in the field of atom probe can be described in a clearly defined way. Using them for each tool makes computations with the paraprobe-toolbox numerically repeatable and the respective input and output files understandable and machine-actionable.

Feel free to use the tools and to suggest improvements or analysis features that you would like to have or that would help improve the toolbox. This documentation serves as a guide for using the paraprobe-toolbox.

How to start?

The paraprobe-toolbox is a combination of software tools. Some are written in Python, others in C/C++. Using the toolbox therefore requires installing the software and its dependencies. We have tested the toolbox successfully on Linux (Ubuntu >=v18) and on Windows using the Windows Subsystem for Linux (WSL2). Compiling on a Mac should be possible, but we have not yet had access to such a computer to test this. Let us know if you would like to use the paraprobe-toolbox on a Mac.

The latest version of the paraprobe-toolbox is v0.4. Using this version is recommended because it offers clearly defined data schemas, clean HDF5 files with provenance tracking, and data compression by default, which makes working with large studies more disk-space efficient.

So far, the paraprobe-toolbox has to be installed in what is effectively a developer version, which requires a sequence of steps. Users should inspect these steps in the following scripts:

PARAPROBE.Step01.InstallOSDeps.sh
PARAPROBE.Step02.InstallCondaJupyterLab.sh
PARAPROBE.Step03.InstallThirdPartyDeps.sh
PARAPROBE.Step04.BuildTools.sh
PARAPROBE.Step05.Install.Tools.sh
PARAPROBE.Step06.GetExamples.sh
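
Before running anything, it can help to check that all six scripts are present and to see them in step order. The following Python sketch does only that; it does not execute the scripts. It assumes the scripts sit in one directory (here the current working directory). The helper names step_number and report_scripts are hypothetical and not part of the paraprobe-toolbox.

```python
from pathlib import Path
from typing import List, Tuple

# Script names exactly as listed in this documentation.
INSTALL_SCRIPTS = [
    "PARAPROBE.Step01.InstallOSDeps.sh",
    "PARAPROBE.Step02.InstallCondaJupyterLab.sh",
    "PARAPROBE.Step03.InstallThirdPartyDeps.sh",
    "PARAPROBE.Step04.BuildTools.sh",
    "PARAPROBE.Step05.Install.Tools.sh",
    "PARAPROBE.Step06.GetExamples.sh",
]


def step_number(script_name: str) -> int:
    """Extract the step number from a name like 'PARAPROBE.Step03.X.sh'."""
    token = script_name.split(".")[1]  # e.g. 'Step03'
    return int(token[len("Step"):])


def report_scripts(directory: Path) -> List[Tuple[int, str, bool]]:
    """Return (step, name, found) for each script, sorted by step number.

    Assumption: all scripts are located directly inside `directory`.
    """
    rows = []
    for name in sorted(INSTALL_SCRIPTS, key=step_number):
        rows.append((step_number(name), name, (directory / name).is_file()))
    return rows


if __name__ == "__main__":
    for number, name, found in report_scripts(Path.cwd()):
        status = "found" if found else "missing"
        print(f"Step {number}: {name} [{status}]")
```

Each script should still be read before it is run, because the steps install system packages and third-party dependencies.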

Thanks to Sarath Menon and members of the NFDI-MatWerk consortium, there is an experimental conda-forge channel available for the paraprobe-toolbox. This channel still provides an older version of the toolbox (v0.3.1), which we are currently porting to v0.4. Therefore, we strongly advise against using the conda-forge channel for now. Once the porting is complete, though, installing the paraprobe-toolbox via conda-forge will be the preferred and much easier way to get an installation.

Another way for users to benefit from the latest research data management software, with native support for and inclusion of a containerized version of the paraprobe-toolbox (amongst other open-source software tools), is to install a local instance of the NOMAD Oasis research data management system.

How to use?

The toolbox comes with a collection of examples that show how the tools can be used for various types of analyses. These examples are available as Jupyter notebooks.

We recommend starting with the beginner examples, specifically exploring the usa_portland_wang.ipynb tutorial first, before customizing existing workflows or building your own with the paraprobe-toolbox. These tutorials show that every analysis with a tool has three steps:

  1. Create a configuration file using convenience functions in Python. The respective tool is called paraprobe-parmsetup. The result is a NeXus/HDF5 config file that includes all settings, time stamps, and hashes of input files.

  2. Run the analysis. This executes either a Python script or a compiled C/C++ application. Results are stored in a NeXus/HDF5 results file that includes all results, time stamps, hashes, and profiling data. Depending on the tool and the tasks performed, additional XDMF files are generated with which the results can be visualized using ParaView.

  3. Post-process the results, for example using Python. For this step, the paraprobe-autoreporter Python tool of the toolbox offers many convenience functions for generating frequently shown plots.
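
The provenance idea behind steps 1 and 2 (time stamps and hashes of input files) can be illustrated with the Python standard library alone. This is a minimal sketch, not the implementation used by paraprobe-parmsetup: the choice of SHA-256, the dictionary layout, and the function names are assumptions made for illustration.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(path: Path) -> dict:
    """Collect the kind of metadata a config file could store for an input file.

    Assumption: key names are hypothetical and do not follow a NeXus schema.
    """
    return {
        "file_name": Path(path).name,
        "sha256": sha256_of_file(Path(path)),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        example = Path(tmp) / "example.pos"
        # Placeholder bytes only, not real atom probe data.
        example.write_bytes(b"placeholder bytes")
        print(provenance_record(example))
```

Storing such a hash alongside the settings lets a later reader verify that a results file was computed from exactly the input file that is at hand.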

How to cite?

Users of tools, data schemas, or implementation ideas of the paraprobe-toolbox should cite the GitLab repository of the software and at least one of the scientific papers for the toolbox:

License

The paraprobe-toolbox is GPLv3 licensed.

Funding

Markus Kühbach gratefully acknowledges the support from several partners over the years: the support from the Deutsche Forschungsgemeinschaft (DFG) through project BA 4253/2-1; the provisioning of computing resources by the Max-Planck Gesellschaft and the funding received through BiGmax, the Max-Planck-Research Network on Big-Data-Driven Materials Science; and the support from the FAIRmat consortium. FAIRmat is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project 460197019.

History

The toolbox is developed by Markus Kühbach, who is supported by members of the international atom probe community. The project started in autumn 2017 with Andrew Breen from the University of New South Wales. The early development took place during a postdoctoral stay in Dierk Raabe’s and Baptiste Gault’s group for atom probe tomography at the Max-Planck Institut für Eisenforschung GmbH in Düsseldorf. Later, the project was supported by the BiGmax research network of the Max-Planck Society. After the move to the Humboldt-Universität zu Berlin (Department of Physics), the project received increasing support from members of the international atom probe community. The project is being developed further now that the German National Research Data Infrastructure (NFDI) has been formed and the main developer of the paraprobe-toolbox supports the international electron microscopy and atom probe community in building and using software for FAIR research data management. The software is now used as an open-source plugin, amongst other tools, in software of the FAIRmat and NFDI-MatWerk consortia of the NFDI.