Materials research generates vast amounts of data that often exist in manufacturer-specific formats with inconsistent terminology, making aggregation, comparison, and reuse difficult. Researchers traditionally spend considerable time on tedious tasks like format conversion, metadata assignment, and characteristics extraction, which can discourage data sharing and hinder data-driven work. This problem is particularly acute as the field increasingly relies on AI-driven materials discovery, which requires high-quality datasets.
To address this challenge, researchers at the National Institute for Materials Science (NIMS) have developed Research Data Express (RDE), a highly flexible data management system for materials scientists. Described in a paper published in Science and Technology of Advanced Materials: Methods, RDE automatically interprets experimental data from raw files and manually entered measurements, then restructures and stores this information in a more readable format.
"RDE significantly reduces the burden of routine data processing for researchers and enhances data findability, interoperability, reusability, and traceability," explains Jun Fujima, corresponding author and researcher at NIMS's Materials Data Platform. "We hope this will promote collaborative, data-driven materials research."
The system's core innovation is the "Dataset Template," which defines and directs how data from different types of experiments should be processed. Unlike similar systems that define data formats, RDE uses templates that let researchers configure how the system interprets data from various sources. For example, if a researcher uploads spreadsheets of X-ray measurements from different sources, the Dataset Template can be configured to interpret them, with the system automatically performing advanced analyses and creating visualizations to provide immediate overviews.
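To make the template idea concrete, the following is a minimal Python sketch of how a per-source template might map differently labeled spreadsheets onto one common structure and extract simple metadata. It is an illustration only, not RDE's actual implementation or toolkit API: the `DatasetTemplate` class, its `structure` method, the column names, and the sample measurements are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class DatasetTemplate:
    """Hypothetical template: maps source-specific column names to canonical ones."""
    name: str
    column_map: dict[str, str]  # source column name -> canonical column name

    def structure(self, raw_csv: str) -> dict:
        """Parse raw CSV text into canonical rows plus simple extracted metadata."""
        reader = csv.DictReader(io.StringIO(raw_csv))
        rows = [
            {canon: float(row[source]) for source, canon in self.column_map.items()}
            for row in reader
        ]
        # Example of automatic metadata extraction: locate the strongest peak.
        peak = max(rows, key=lambda r: r["intensity"])
        return {
            "template": self.name,
            "data": rows,
            "metadata": {
                "num_points": len(rows),
                "peak_two_theta": peak["two_theta"],
            },
        }


# Two made-up sources that label the same X-ray quantities differently.
vendor_a = DatasetTemplate(
    "xrd_vendor_a", {"2Theta(deg)": "two_theta", "Counts": "intensity"}
)
vendor_b = DatasetTemplate(
    "xrd_vendor_b", {"angle": "two_theta", "I": "intensity"}
)

raw_a = "2Theta(deg),Counts\n10.0,120\n10.5,340\n11.0,90\n"
raw_b = "angle,I\n20.0,15\n20.5,80\n"

result_a = vendor_a.structure(raw_a)
result_b = vendor_b.structure(raw_b)
```

In this sketch, both results share the same canonical keys (`two_theta`, `intensity`), so files from either source can be aggregated and compared directly once a matching template exists.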
Multiple templates can be prepared for different materials research themes, allowing maximum flexibility in data management, and custom templates can be easily prepared by individual researchers when necessary. Many templates have already been prepared and shared among users. "RDE's unique approach allows researchers to freely define data structures tailored to their instruments, while enabling the system to perform massive data structuring and metadata extraction automatically," says Fujima.
Since its launch in January 2023, RDE has demonstrated significant scalability, with over 5,000 users across Japan's materials research community. More than 1,900 Dataset Templates have been implemented for various experimental methods, over 16,000 datasets have been created, and more than three million data files have accumulated. The system serves as data infrastructure for major national initiatives, including the Materials Research DX Platform initiative promoted by Japan's Ministry of Education, Culture, Sports, Science and Technology.
The NIMS team has released an open-source software toolkit to encourage use of the system within the research community. The development addresses a critical bottleneck in materials science by creating standardized, AI-ready datasets that could accelerate discoveries in areas ranging from battery technology to semiconductor development. The system's ability to automate data processing while maintaining flexibility represents a significant advancement toward more efficient and collaborative materials research.


