Vietnam’s Durian Disease Dataset: AI’s Answer to Farming’s Million-Dollar Question

In the heart of Vietnam’s durian orchards, a new dataset aims to improve how plant diseases are detected and diagnosed, which could help farmers reduce costly crop losses. Tan Nguyen, a Master’s student at Ho Chi Minh City University of Technology (HUTECH) and CEO of BSP Software Services Corporation, has compiled an image dataset of ten common durian diseases, captured under real field conditions. The dataset, published in *Data in Brief*, comprises 5,452 images of durian plant parts (leaves, flowers, branches, stems, and roots), all affected by disease.

The images were captured using an iPhone 14, simulating the typical photography practices of farmers. This approach introduces varied angles, inconsistent lighting, and complex environmental backgrounds, creating a noisy but realistic dataset. “The goal was to create a dataset that reflects real-world conditions,” Nguyen explains. “Farmers don’t have controlled environments when they take photos of their crops. We wanted to capture that reality to make our models more robust.”

Each disease class contains approximately 405–427 raw images, which were manually reviewed and cropped to focus on disease-affected regions. The processed images are provided in PNG format with variable dimensions, resized to 224×224 pixels only during model training. Disease symptoms were verified in collaboration with plant pathologists to ensure accurate classification.
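For readers who download the dataset, a quick check of the per-class image counts is a sensible first step before training. The following is a minimal sketch, not the author’s tooling: it assumes a hypothetical layout of one folder per disease class containing PNG files, and the function names `count_images_per_class` and `imbalance_ratio` are illustrative. The actual archive may be organized differently.

```python
from collections import Counter
from pathlib import Path


def count_images_per_class(root, suffix=".png"):
    """Count image files in each immediate subfolder of `root`.

    Assumes a hypothetical layout of one folder per disease class,
    e.g. root/<class_name>/<image>.png; the real archive may differ.
    """
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip loose files such as a README
        # Compare suffixes case-insensitively so .PNG files are not missed.
        counts[class_dir.name] = sum(
            1
            for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() == suffix
        )
    return dict(counts)


def imbalance_ratio(counts):
    """Return largest class size divided by smallest (1.0 means balanced)."""
    sizes = [n for n in counts.values() if n > 0]
    if not sizes:
        raise ValueError("no non-empty classes found")
    return max(sizes) / min(sizes)
```

If the per-class counts match the reported range of 405–427 raw images, the imbalance ratio would be no more than roughly 1.05, meaning the classes are close to balanced and no reweighting should be needed on that account.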

The commercial implications of this dataset are substantial. Durian is a high-value crop, particularly in Southeast Asia, and diseases can lead to significant economic losses. By providing a realistic dataset, Nguyen’s work enables the development of machine learning models that can accurately diagnose diseases under real-world conditions. This is crucial for the creation of mobile or edge-based diagnostic tools, which can be used directly in the field by farmers.

“The potential for this dataset extends beyond durian,” says Nguyen. “It sets a precedent for how we can capture and use real-world data to improve agricultural diagnostics. This could be a game-changer for the entire sector.”

The dataset is publicly available on Mendeley Data, making it accessible to researchers and developers worldwide. As agriculture adopts more technology, realistic datasets like this one are likely to play a growing role in plant disease diagnosis and management. Nguyen’s dual affiliation with BSP Software Services Corporation and HUTECH also underscores the value of collaboration between academia and industry in driving agricultural innovation.
