Dataset Download
Extract the provided zip file into a folder; this yields the folder structure shown under "Dataset Format".
Dataset Format
For each image of the training (1,407 images) and validation set (772 images), we provide the following annotations in the corresponding sub-folders:
- semantics: Pixel-wise semantic masks, where label ids correspond to: background (0), crop (1), weed (2), partial-crop (3), partial-weed (4). Partial crops and weeds have less than 50% visible pixels.
- plant_instances: Pixel-wise instance masks for crops and weeds, where ids > 0 correspond to distinct instances.
- leaf_instances: Pixel-wise instance masks for leaves, where ids > 0 correspond to distinct instances.
- plant_visibility: Pixel-wise visibility in range [0,255], where 255 means fully visible.
- leaf_visibility: Pixel-wise visibility in range [0,255], where 255 means fully visible.
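To illustrate how these masks can be used together, here is a minimal sketch. It assumes the masks are already loaded as NumPy arrays and that id 3 marks partial crops while id 4 marks partial weeds. The function names are hypothetical and are not part of the devkit.

```python
import numpy as np

# Semantic label ids (assumption: 3 = partial crop, 4 = partial weed).
BACKGROUND, CROP, WEED, PARTIAL_CROP, PARTIAL_WEED = 0, 1, 2, 3, 4


def merge_partial_labels(semantics: np.ndarray) -> np.ndarray:
    """Map partial-crop (3) to crop (1) and partial-weed (4) to weed (2)."""
    merged = semantics.copy()  # leave the input mask untouched
    merged[semantics == PARTIAL_CROP] = CROP
    merged[semantics == PARTIAL_WEED] = WEED
    return merged


def instance_ids(instances: np.ndarray) -> list[int]:
    """Return the distinct instance ids in a mask, skipping id 0."""
    return [int(i) for i in np.unique(instances) if i > 0]


def visibility_fraction(visibility: np.ndarray) -> np.ndarray:
    """Rescale visibility from [0, 255] to [0.0, 1.0]; 1.0 means fully visible."""
    return visibility.astype(np.float32) / 255.0
```

Merging the partial classes is only one possible choice; whether it is appropriate depends on the task at hand.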
We store all annotations as 16-bit PNG files, as these provide decent lossless compression and can be easily read with off-the-shelf Pillow or OpenCV, minimizing the dependencies.
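Reading such a file could look like the following sketch, which uses Pillow and NumPy. The helper name `load_annotation` is hypothetical and is not taken from the devkit.

```python
import numpy as np
from PIL import Image


def load_annotation(path: str) -> np.ndarray:
    """Read a single-channel PNG annotation and return it as a 2D integer array."""
    with Image.open(path) as img:
        mask = np.array(img)  # forces the pixel data to be decoded
    if mask.ndim != 2:
        raise ValueError(f"expected a single-channel mask, got shape {mask.shape}")
    return mask
```

With OpenCV, `cv2.imread(path, cv2.IMREAD_UNCHANGED)` serves the same purpose; the `IMREAD_UNCHANGED` flag is needed to keep the 16-bit depth.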
All images and corresponding annotations have a size of 1024 by 1024 pixels. The size was chosen such that even in later growth stages multiple plants are completely inside the image.
We additionally provide the acquisition date, e.g., 05-15, 05-26, or 06-05, at the beginning of every filename, which allows separating the data based on the date of data acquisition.
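Splitting a list of files by acquisition date could be sketched as follows. The sketch assumes that every filename starts with a two-digit month and day in the form `MM-DD`. The function names and the filenames in the usage example are hypothetical.

```python
import os
import re
from collections import defaultdict

# Assumption: the filename starts with "MM-DD", e.g. "05-15".
DATE_PREFIX = re.compile(r"^(\d{2}-\d{2})")


def acquisition_date(path: str) -> str:
    """Return the "MM-DD" prefix of the filename, ignoring any directory part."""
    name = os.path.basename(path)
    match = DATE_PREFIX.match(name)
    if match is None:
        raise ValueError(f"no date prefix found in {name!r}")
    return match.group(1)


def group_by_date(paths):
    """Group file paths by their acquisition date prefix."""
    groups = defaultdict(list)
    for path in paths:
        groups[acquisition_date(path)].append(path)
    return dict(groups)


# Usage with made-up filenames:
files = ["05-15_a.png", "05-26_b.png", "05-15_c.png"]
by_date = group_by_date(files)
```

Grouping by date makes it possible, for example, to evaluate a model separately for each growth stage.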
Please see our devkit, which provides a PyTorch dataloader that is ready to use with the dataset. The baseline implementations and instructions to reproduce our experiments are available in a separate repository. See our Code page for more information.
Unlabeled Data
For self-supervised pre-training or unsupervised training, we additionally provide a large number of unlabeled images. Here, we distinguish between patches extracted from the original images and augmented patches extracted from rotated versions of these images.
Patches | Augmented Patches |
---|---|
Example data (149 MB) | Example data (152 MB) |
April 25, 2020 (10 GB) | April 25, 2020 (35.3 GB) |
May 03, 2020 (9.5 GB) | May 03, 2020 (33.2 GB) |
May 15, 2020 (11.9 GB) | May 15, 2020 (37.6 GB) |
May 26, 2020 (10.3 GB) | May 26, 2020 (36.1 GB) |
June 05, 2020 (7.6 GB) | June 05, 2020 (26.5 GB) |
June 12, 2020 (7.2 GB) | June 12, 2020 (26.4 GB) |
July 02, 2020 (6.5 GB) | July 02, 2020 (20.1 GB) |
Dataset License
We distribute the data under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
This means that you must attribute the work in the manner specified by the authors and if you alter, transform, or build upon the material for any purpose, even commercially, you may distribute the resulting work only under the same license.
Specifically, you should cite our work (PDF):
@article{weyler2023pami,
author = {Jan Weyler and Federico Magistri and Elias Marks and Yue Linn Chong and Matteo Sodano
and Gianmarco Roggiolani and Nived Chebrolu and Cyrill Stachniss and Jens Behley},
title = {{PhenoBench --- A Large Dataset and Benchmarks for Semantic Image Interpretation
in the Agricultural Domain}},
journal = {IEEE Trans. on Pattern Analysis and Machine Intelligence (T-PAMI)},
year = {2024}
}
We appreciate donations of any amount you feel appropriate from commercial users. Please contact photogrammetry@uni-bonn.de if you want more information.