Type: ARTICLE
Key: Gentilhomme_COMPAG_2023/IDIAP
Title: Towards Smart Pruning: ViNet, a Deep-Learning Approach for Grapevine Structure Estimation
Authors: Gentilhomme, Théophile; Villamizar, Michael; Corre, Jérome; Odobez, Jean-Marc
Keywords: convolutional network, deep learning, grapevine pruning, plant skeleton, precision viticulture, vineyard
Journal: Computers and Electronics in Agriculture
Volume: 207
Article number: 107736
ISSN: 0168-1699
Year: 2023
URL: https://www.sciencedirect.com/science/article/pii/S0168169923001242
DOI: https://doi.org/10.1016/j.compag.2023.107736

Abstract: Image and video tools for analysing crop scenes and plants are essential for applying precision agriculture to crop maintenance, harvesting, or pruning. In this paper, we are interested in vine pruning, a task that requires a precise understanding of the vine structure, including branch types, orientations, and node locations. However, estimating such a structure is highly challenging, given the large variety in grapevine appearance, lighting conditions, and viewpoint, the interweaving of branches, occlusions, and the level of detail needed. To address these challenges, we propose ViNet, a deep-learning approach for estimating the structure of a grapevine, which comprises two main steps: the first detects nodes and identifies the branch types of the plant, as well as the spatial relations between them, whilst the second uses the extracted nodes and branches to build a graph, from which the structure of the grapevine is inferred.
In doing so, we make three main contributions: (i) we put forward, for the first time, a method for automatic segmentation and extraction of the grapevine structure from images; (ii) we propose a novel approach that leverages a powerful stacked hourglass network to infer node locations and branch types, along with a novel shortest-path weighted graph optimization step that extracts connections between nodes and infers the structure, which addresses the problem of having an unknown number of branches in the tree; (iii) we publicly release a dataset of more than 1500 grapevine images fully annotated with structure information. Extensive experiments on this dataset demonstrate the effectiveness of our approach at predicting the structure of a grapevine, achieving a precision and recall for node prediction of 95% and 90%, respectively, and ablation studies validate our design choices.
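
To illustrate the general idea behind a shortest-path step on a weighted graph, the following is a minimal sketch and not the authors' implementation. It assumes the detected nodes are already available as identifiers, that each candidate link between two nodes carries a non-negative cost (for example, derived from the network's predicted relations), and that one node is known to be the root. The function name `shortest_path_tree`, the node names, and the cost values are all hypothetical. The sketch runs Dijkstra's algorithm from the root and keeps, for every reachable node, the predecessor on its cheapest path; the resulting parent map is a tree whose number of branches is not fixed in advance.

```python
import heapq


def shortest_path_tree(root, edges):
    """Return {node: parent} for every node reachable from `root`.

    `edges` maps an undirected candidate link (u, v) to a non-negative cost.
    Both the cost definition and the choice of root are assumptions made
    for this illustration only.
    """
    # Build an adjacency list from the undirected candidate links.
    adjacency = {}
    for (u, v), cost in edges.items():
        adjacency.setdefault(u, []).append((v, cost))
        adjacency.setdefault(v, []).append((u, cost))

    dist = {root: 0.0}
    parent = {root: None}
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a cheaper path was already found
        for v, cost in adjacency.get(u, []):
            candidate = d + cost
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                parent[v] = u
                heapq.heappush(heap, (candidate, v))
    return parent


# Hypothetical candidate links with made-up costs (lower means more plausible).
edges = {
    ("trunk", "n1"): 0.1,
    ("n1", "n2"): 0.2,
    ("n1", "n3"): 0.3,
    ("trunk", "n3"): 0.9,
    ("n2", "n3"): 0.8,
}
tree = shortest_path_tree("trunk", edges)
for node, par in tree.items():
    print(node, "<-", par)
```

In this toy example both `n2` and `n3` attach to `n1` rather than directly to the trunk or to each other, because those paths are cheaper. How costs are actually defined and how the root is chosen in ViNet is not specified in the abstract.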