About us

What is the Global Rice Dataset?

Objective

To build a broad, multi-class, high-resolution semantic segmentation dataset for deep learning studies and applications in rice crops.

Dataset scale

Includes 3,078 ground-based RGB images collected in 5 countries by 12 institutions, taken throughout the whole growth cycle and covering a wide range of genotype-environment-management combinations.

Global RiceSEG is an open-source and free-to-use dataset

With the following features:
Broad multi-class annotation with six categories
High-resolution RGB images
Global diversity across 5 countries and 12 institutions
Full rice growth cycle
Large-scale dataset with 3,078 images
Baseline benchmarks provided
Download Dataset

Featured Article

Global rice multiclass segmentation dataset (RiceSEG): comprehensive and diverse high-resolution RGB-annotated images for the development and benchmarking of rice segmentation algorithms

Abstract

The development of computer vision-based rice phenotyping techniques is crucial for precision field management and accelerated breeding, which facilitate continuously advancing rice production. Among phenotyping tasks, distinguishing image components is a key prerequisite for characterizing plant growth and development at the organ scale, enabling deeper insights into ecophysiological processes. However, owing to the fine structure of rice organs and complex illumination within the canopy, this task remains highly challenging, underscoring the need for a high-quality training dataset. Such datasets are scarce, both because of a lack of large, representative collections of rice field images and because of the time-intensive nature of the annotation. To address this gap, we created the first comprehensive multiclass rice semantic segmentation dataset, RiceSEG. We gathered nearly 50,000 high-resolution, ground-based images from five major rice-growing countries (China, Japan, India, the Philippines, and Tanzania), encompassing more than 6000 genotypes across all growth stages. From these original images, 3078 representative samples were selected and annotated with six classes (background, green vegetation, senescent vegetation, panicle, weeds, and duckweed) to form the RiceSEG dataset. Notably, the subdataset from China spans all major genotypes and rice-growing environments from northeastern to southern regions. Both state-of-the-art convolutional neural networks and transformer-based semantic segmentation models were used as baselines. While these models perform reasonably well in segmenting background and green vegetation, they face difficulties during the reproductive stage, when canopy structures are more complex and when multiple classes are involved. These findings highlight the importance of our dataset for developing specialized segmentation models for rice and other crops. The RiceSEG dataset is publicly available at...
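As a rough illustration of how a segmentation model's output could be scored against the six annotated classes, the sketch below computes per-class intersection over union (IoU) and its mean in plain Python. IoU is a common metric for semantic segmentation, but this is not the authors' evaluation code: the integer ID assigned to each class, the nested-list mask format, and the function names `per_class_iou` and `mean_iou` are all assumptions made for this example.

```python
from collections import Counter

# The six class names come from the abstract; the integer IDs (list positions)
# are an ASSUMPTION and may differ from the encoding in the released masks.
CLASSES = [
    "background",
    "green vegetation",
    "senescent vegetation",
    "panicle",
    "weeds",
    "duckweed",
]


def per_class_iou(pred, target, num_classes=len(CLASSES)):
    """Return one IoU per class; None where a class is absent from both masks."""
    intersection = Counter()
    pred_area = Counter()
    target_area = Counter()
    # Walk both masks pixel by pixel and count areas and overlaps per class.
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_area[p] += 1
            target_area[t] += 1
            if p == t:
                intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        ious.append(intersection[c] / union if union else None)
    return ious


def mean_iou(ious):
    """Average IoU over the classes that actually occur."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present) if present else float("nan")


# Toy 2x3 masks (hypothetical data, not taken from the dataset).
target = [[0, 1, 1],
          [3, 3, 0]]
pred = [[0, 1, 3],
        [3, 3, 0]]

ious = per_class_iou(pred, target)
for name, value in zip(CLASSES, ious):
    print(name, value)
print("mean IoU:", mean_iou(ious))
```

Reporting IoU per class rather than a single pixel accuracy makes it easier to see the pattern the abstract describes, where background and green vegetation are segmented well while the less frequent classes are harder.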

URL Nikkei News

Dataset Preview

Paired examples

Sponsors

They supported our project

National Key R&D Program of China

No. 2022YFD2300700,
No. 2022YFE0116200,
No. 2021YFD2000105

Academic Sponsor

Young Scientists Fund of the National Natural Science Foundation of China

No. 42201437

Academic Sponsor

Japan Society for the Promotion of Science

No. 22KK0083

Academic Sponsor

Our team

17 Contributors from 13 universities and institutions

Nanjing Agricultural University
The University of Tokyo
Indian Institute of Technology
Professor Jayashankar Telangana State Agricultural University
Kyoto University
Huazhong University of Science and Technology
Huazhong Agricultural University
The Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Shenyang Agricultural University
Jilin Academy of Agricultural Sciences
Institute of Crop Sciences, Chinese Academy of Sciences
Yuan Longping High-Tech Agriculture Co., Ltd.

Contact us

Corresponding authors

shouyang.liu<@>njau.edu.cn

guowei<@>g.ecc.u-tokyo.ac.jp

100 XiaoLingWei Street

Nanjing, China