Evaluation of the Potential of Convolutional Neural Networks and Random Forests for Multi-Class Segmentation of Sentinel-2 Imagery
Abstract
1. Introduction
- Signal discrepancies due to varying environmental conditions;
- The degree of land-cover fragmentation and how directly the observed cover corresponds to the respective class definition;
- Human activities, climate, landscape morphology, and slope create an inhomogeneous mosaic (land-cover complexity), especially when the study area is large;
- At the spatial resolution of Sentinel-2 and Landsat, the area depicted by a single pixel often contains a mixture of signals from diverse classes; in addition, the spatial resolution at which an object (for instance, a building, terrain feature, or field) becomes recognizable varies remarkably (a wide range between coarse and fine resolution);
- Clouds and artefacts make apparent the need to shift from single-image processing to time-series analysis, a shift that, although it resolves some issues, introduces additional complications in terms of computational resources and data analysis;
- Image analysis techniques tailored to the technical specifications and operating modes of a specific sensor cannot be transferred and adapted, without effort, to different sensing and operational modes;
- There is still a lack of large-scale, time-specific, and multi-purpose annotated data sets, and a strong imbalance in favour of a few dominant classes; moreover, the high cost of field campaigns prevents the frequent update of the validation maps;
- Complex modelling is required to capture the distinct non-linear feature relations due to atmospheric and geometric distortions;
- The learning mechanism of some of the most efficient complex models is not transparent; the theoretical understanding of deep-learning models and the interpretation of feature extraction is an ongoing research process;
- Data-driven approaches, like machine and statistical learning, are extremely dependent on the quality of the input and reference data;
- The processing of the requisite data volume challenges conventional computing systems in terms of memory and processing power, and calls for dedicated treatment by means of cloud-based and high-performance infrastructures (scalability).
2. Background
2.1. INSPIRE
2.2. Spatial Data Infrastructure
3. Materials and Methods
3.1. The INSPIRE TOP10NL
3.2. The Sentinel-2 Imagery
3.3. Remote Sensing and Computational Approaches
- Both types of modelling are non-parametric and capable of learning sophisticated mappings;
- They are fast and easy to implement, and can handle a very large number of input variables while keeping the risk of overfitting low;
- Hand-crafted feature extraction is replaced by automatic feature generation, a process in which linear and non-linear mappings are applied either to the original data or to values derived from successive mathematical operations; this process takes place while solving the optimization problem that builds the relationships between input and reference data;
- Deep neural networks efficiently learn discriminative features through their natural hierarchical structure; in a similar way, the random forest builds decision trees in which the ancestor/descendant relationship is evaluated based on the discriminative power of the original features;
- Convolutional neural networks have an inherent capacity to model complex processes via the use of a high number of parameters (provided that sufficient training data are available, as is the case with the abundance of Earth observation data). Although the random forest is less demanding in terms of structural parameters, it has repeatedly proven quite efficient at modelling complex relationships, mainly due to its streamlined weighting of contributing factors and its effective synthesis of the outputs of ensemble modelling.
3.3.1. Convolutional Neural Networks
- Standard CNN: a network configuration consisting of a sequence of convolutional/pooling layers, which receives as input the neighbourhood (for instance, 5 × 5 or 15 × 15) of the pixel of interest. At the final layer, the class label of the central pixel of this neighbourhood is predicted (a minimal sketch of this variant is given after this list);
- Fully Convolutional Network (FCN): the network consists of convolutional/pooling layers only. The final layer performs an upsampling operation to reach the input spatial resolution [39];
- U-net: an extension of the FCN architecture in which the standard contracting network is supplemented by successive layers where the pooling operators are replaced by upsampling operators, thus increasing the resolution of the output. It has been shown experimentally that the model parameters converge to satisfactory estimates with very few training samples [40].
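To make the first (windowed) variant concrete, the following is a minimal sketch and not the exact topology used in our experiments: it assumes a Keras/TensorFlow implementation, the four 10 m bands of Table 2, a 15 × 15 input window, and the nine TOP10NL labels of Table 1; the filter counts and layer depths are illustrative assumptions.

```python
# Minimal sketch of the "standard CNN" variant: a 15x15x4 window in, one
# class label (for the central pixel) out. Assumes Keras/TensorFlow; the
# layer sizes are illustrative, not the configuration used in this work.
# The tanh activation follows the observation in Section 5 that it worked
# better than the rectifier for the shallower CNN-wxw topologies.
import tensorflow as tf
from tensorflow.keras import layers

N_BANDS, WINDOW, N_CLASSES = 4, 15, 9  # B02/B03/B04/B08, 15x15 window, 9 labels

model = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, WINDOW, N_BANDS)),
    layers.Conv2D(32, 3, activation="tanh", padding="same"),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="tanh", padding="same"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(128, activation="tanh"),
    layers.Dense(N_CLASSES, activation="softmax"),  # label of the central pixel
])
```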
3.3.2. Random Forest Classifier
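For the windowed RF configurations evaluated below (RF-1x1, RF-3x3, RF-5x5), the following is a minimal sketch assuming scikit-learn; the tree count and the random placeholder data are illustrative assumptions, not the settings of our experiments.

```python
# Minimal sketch of a windowed RF classifier (e.g., RF-5x5), assuming
# scikit-learn: each sample is a flattened W x W x 4 neighbourhood and the
# target is the TOP10NL label of the central pixel.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

W, N_BANDS, N_CLASSES = 5, 4, 9
X = np.random.rand(1000, W * W * N_BANDS)       # placeholder feature vectors
y = np.random.randint(0, N_CLASSES, size=1000)  # placeholder central-pixel labels

rf = RandomForestClassifier(
    n_estimators=100,   # illustrative; not the exact setting of the experiments
    criterion="gini",   # Gini impurity performed slightly better (Section 5)
    n_jobs=-1,          # CPU parallelism, cf. the platform discussed in Section 4.5
)
rf.fit(X, y)
labels = rf.predict(X[:10])  # one label per 5x5 window, for its central pixel
```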
4. Results
- Number of training samples: a high number of model parameters requires an even larger number of training samples (model complexity), especially when the problem at hand is multi-class classification (problem complexity);
- Transfer learning: intentionally, we did not use pre-trained layers, so as to check the capacity of the models to extract discriminative features from the input imagery and also to measure the actual training time. However, we adopted another form of transfer learning by assessing model performance either on a different type of input imagery (S2 processing levels) or against disparate reference data;
- Hyper-parameters: since the models under consideration are multi-parametric, it is practically impossible to quantify the effect of all model parameters exhaustively. As described in the following sections, we selected the parameters that, in our opinion, play a critical role in model performance. Parameters such as the window size of the convolution or pooling layers and the window stride in the case of CNNs, or the minimum number of samples required at a leaf node and the minimum number of samples required to split an internal node in the case of RF, were kept fixed in our experiments. Such parameters certainly affect model performance, but we regard them as candidates for further fine-tuning; a sketch of the kind of grid such a search implies is given after this list.
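To make the scope of such a search concrete, the sketch below enumerates illustrative grids with scikit-learn's ParameterGrid; all value ranges are placeholders, not the grids we actually evaluated.

```python
# Illustrative hyper-parameter grids for the two model families; the listed
# values are placeholders, not the ranges tested in this work.
from sklearn.model_selection import ParameterGrid

rf_grid = ParameterGrid({
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 20, 40],
    "criterion": ["gini", "entropy"],
})
cnn_grid = ParameterGrid({
    "activation": ["tanh", "relu"],
    "window": [5, 15],
    "batch_size": [32, 64],
})

for params in rf_grid:
    print(params)  # train and evaluate one RF per combination here
```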
4.1. Pre-Processing and Data Handling
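As a rough illustration of this step, the following sketch assumes the GDAL Python bindings [15] and the four 10 m bands of Table 2; the band file names are hypothetical stand-ins for the granule contents listed in Appendix A.

```python
# Sketch: stack the four 10 m bands (B02, B03, B04, B08) of one S2 granule
# into a single (rows, cols, 4) array, assuming the GDAL Python bindings.
# The file names below are hypothetical placeholders.
import numpy as np
from osgeo import gdal

gdal.UseExceptions()

band_files = [
    "T31UET_20181117T105321_B02.jp2",
    "T31UET_20181117T105321_B03.jp2",
    "T31UET_20181117T105321_B04.jp2",
    "T31UET_20181117T105321_B08.jp2",
]
stack = np.stack([gdal.Open(f).ReadAsArray() for f in band_files], axis=-1)
print(stack.shape)  # (rows, cols, 4)
```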
4.2. Training Phase
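A minimal sketch of how such a run could be wired up, assuming Keras and reusing the `model` object from the sketch in Section 3.3.1; the Adam optimizer [47] and the categorical cross-entropy loss match the choices discussed in Section 5, while the batch size, learning rate, and epoch count are placeholders.

```python
# Sketch of a training run, assuming Keras; `model` is the standard CNN
# sketched in Section 3.3.1. All hyper-parameter values are placeholders.
import numpy as np
import tensorflow as tf

X_train = np.random.rand(1000, 15, 15, 4).astype("float32")  # placeholder windows
y_train = tf.keras.utils.to_categorical(
    np.random.randint(0, 9, size=1000), num_classes=9)       # placeholder labels

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # Adam [47]
    loss="categorical_crossentropy",                         # cf. Section 5
    metrics=["accuracy"],
)
model.fit(X_train, y_train, batch_size=64, epochs=20, validation_split=0.1)
```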
4.3. Evaluation Metrics
4.4. Performance Figures
- Model: the symbolic name “M-L-L-W” of the computational model. The notation is explained as follows: (i) M refers to the classifier type, i.e., RF, CNN, FCN, Unet, or SegNet; (ii) L denotes one of the four groups of S2 products: L1C(winter), L2A(winter), L1C(spring), L2A(spring); (iii) the first L indicates the series of products used for training, and the second L the products used for testing. When the letter L appears in both positions, it means firstly that the same group of products was used for training and testing, and secondly that the classification results are similar in all four cases; notation like L1Cmix signifies L1C image blocks taken from both winter and spring seasons; (iv) W signifies the size of the spatial window. For RF and the standard CNN in particular, this means input: W × W pixels per band; output: one class label corresponding to the central pixel of the W × W spatial window;
- Training time: the elapsed time in seconds to process 1M training samples (pixels), using the number of processing units described in Section 4.5 (hardware specifications);
- Prediction time: the elapsed time in seconds to process 1M testing samples (pixels), using the number of processing units described in Section 4.5 (hardware specifications);
- OA: Overall Accuracy (range [0, 1], best value 1);
- WA: Weighted Accuracy (range [0, 1], best value 1);
- F1-macro: F1-score (macro) (range [0, 1], best value 1);
- F1-weighted: F1-score (weighted) (range [0, 1], best value 1); a sketch reproducing these scores from the predicted and reference labels is given after this list.
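The scores above can be reproduced from the flattened predicted and reference label vectors with scikit-learn, as sketched below; since the exact weighting behind WA is not restated here, balanced accuracy (macro-averaged recall) is shown as one plausible stand-in and should be treated as an assumption.

```python
# Computing the reported scores from predicted/reference label vectors,
# assuming scikit-learn; balanced_accuracy_score stands in for WA (an
# assumption about the weighting, which is not restated in this list).
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

y_true = np.random.randint(0, 9, size=10_000)  # placeholder reference labels
y_pred = np.random.randint(0, 9, size=10_000)  # placeholder predictions

oa = accuracy_score(y_true, y_pred)                         # OA
wa = balanced_accuracy_score(y_true, y_pred)                # WA (assumed variant)
f1_macro = f1_score(y_true, y_pred, average="macro")        # F1-macro
f1_weighted = f1_score(y_true, y_pred, average="weighted")  # F1-weighted
```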
4.5. The Computational Platform: JRC Earth Observation Data and Processing Platform (JEODPP)
5. Discussion
- The results confirmed the high modelling capacity of both RFs and CNN variants for end-to-end S2 image segmentation, in agreement with the experimental evidence to date that such models are general-purpose, accurate learning techniques;
- Basic hyper-parameters of both approaches do not affect the classification performance greatly, exhibiting a high level of stability given an adequate number of training data. The categorical cross-entropy loss in the CNN case and the Gini impurity in the RF case lead to slightly better results. For the deep models Unet, SegNet, and FCN, the rectifier activation function reduced the computational cost considerably and resulted in better classification, whereas for the shallower topologies of the CNN-wxw models the hyperbolic tangent activation proved more effective;
- Similarly, the processing level of S2 products (either L1C or L2A) does not have a significant impact on the classification results;
- On the contrary, variability due to different seasonal conditions can only be captured by introducing training samples from different sensing time stamps. The analysis and modelling of time series turns out to be a key component for this type of task. CNNs appear more robust and capture the data variability (landscape diversity and variable time) better, owing to the high number of parameters they carry;
- In the case of the Unet, SegNet, and FCN networks, the spatial size of the input tensor has an impact on the classification results. At smaller sizes, a more complex architecture (with a high number of filters) is required, whereas at larger sizes an increasing number of training samples is needed. For the exercise presented herein, a spatial size of 244 × 244 is close to optimal;
- The built-up class cannot be modelled easily, due to the inherent mapping issues already mentioned in Section 3.1, and is confused with the class other. This also happens because of the strong overlap of the two signal signatures and the shortage of training samples for the built-up class. The same phenomenon occurs between the class low-density vegetation and the classes grassland and cropland, especially in the spring season. However, we noticed some exceptional cases, such as RF-1x1, which confuses grassland and the remaining classes but detects the built-up class sufficiently well. In practice, when a class is represented by a high number of training samples, as is the case for water, any of the tested modelling approaches proves efficient;
- Pixel mismatches between the model output and the reference layer are sometimes due to the fact that the model response is based on the S2 input and its sensing conditions. In effect, the model output can be seen as an updated version of the reference layer, reflecting a trade-off between the properties of the input and reference layers;
- The simple SegNet approach returns a rather coarse segmentation. Similarly, the FCN output exhibits a block effect. Both network configurations require more and finer upsampling layers to counterbalance the downsampling layers, as happens in the architectures of Unet and SegNet. This is also evidence that, on the one hand, a deep neural network has the flexibility to model complex problems, but, on the other hand, it requires tedious and long-lasting fine-tuning to reach an optimal configuration for the task at hand;
- In most cases, RF performs quite well even though it does not exploit the feature localization provided by convolutional filters. To capture the high variability present in the samples, RF tends to grow long trees, which may lead to overfitting; keeping the trees shorter sometimes results in weak modelling. In conclusion, the inherent capacity of RF has an upper bound that puts a ceiling on the modelling of complex processes;
- The results of the transfer learning exercise show an acceptable level of classification agreement, given the geographical and morphological discrepancies between the two countries (the Netherlands and Albania). Nevertheless, when the target application is a European land cover map, the exploitation of additional information sources and databases such as BigEarthNet [51] or CORINE Land Cover 2018 is imperative;
- For big training sets, CPU- or GPU-based parallelization is necessary for both CNNs and RFs, while the prediction time remains at an acceptable level. This confirms the need for specialized computational resources and dedicated Earth observation exploitation platforms. With this work, we demonstrate that JEODPP, the EC JRC high-throughput and high-performance computational platform that is gradually being built, is evolving in the proper direction.
6. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Appendix A
- S2B_MSIL1C_20181010T104019_N0206_R008_T32ULC_20181010T161145.SAFE
- S2B_MSIL1C_20181010T104019_N0206_R008_T31UGU_20181010T161145.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UET_20181117T112412.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UFU_20181117T112412.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UFS_20181117T112412.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UFT_20181117T112412.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UGS_20181117T112412.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UFV_20181117T112412.SAFE
- S2B_MSIL1C_20181212T105439_N0207_R051_T31UES_20181212T112608.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UGV_20181117T112412.SAFE
- S2A_MSIL1C_20181117T105321_N0207_R051_T31UGT_20181117T112412.SAFE
- S2B_MSIL2A_20181010T104019_N0209_R008_T32ULC_20181010T171128.SAFE
- S2B_MSIL2A_20181010T104019_N0209_R008_T31UGU_20181010T171128.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UET_20181117T121932.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UFU_20181117T121932.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UFS_20181117T121932.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UFT_20181117T121932.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UGS_20181117T121932.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UFV_20181117T121932.SAFE
- S2B_MSIL2A_20181212T105439_N0211_R051_T31UES_20181218T134549.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UGV_20181117T121932.SAFE
- S2A_MSIL2A_20181117T105321_N0210_R051_T31UGT_20181117T121932.SAFE
- S2A_MSIL1C_20180508T104031_N0206_R008_T32ULD_20180508T175127.SAFE
- S2A_MSIL1C_20180508T104031_N0206_R008_T31UGS_20180508T175127.SAFE
- S2A_MSIL1C_20180508T104031_N0206_R008_T32ULC_20180508T175127.SAFE
- S2A_MSIL1C_20180508T104031_N0206_R008_T32ULE_20180508T175127.SAFE
- S2A_MSIL1C_20180508T104031_N0206_R008_T31UFS_20180508T175127.SAFE
- S2B_MSIL1C_20180506T105029_N0206_R051_T31UET_20180509T155709.SAFE
- S2B_MSIL1C_20180506T105029_N0206_R051_T31UFT_20180509T155709.SAFE
- S2B_MSIL1C_20180506T105029_N0206_R051_T31UFU_20180509T155709.SAFE
- S2B_MSIL1C_20180506T105029_N0206_R051_T31UES_20180509T155709.SAFE
- S2B_MSIL1C_20180506T105029_N0206_R051_T31UFV_20180509T155709.SAFE
- S2B_MSIL1C_20180506T105029_N0206_R051_T31UGT_20180509T155709.SAFE
- S2A_MSIL2A_20180508T104031_N0207_R008_T32ULD_20180508T175127.SAFE
- S2A_MSIL2A_20180508T104031_N0207_R008_T31UGS_20180508T175127.SAFE
- S2A_MSIL2A_20180508T104031_N0207_R008_T32ULC_20180508T175127.SAFE
- S2A_MSIL2A_20180508T104031_N0207_R008_T32ULE_20180508T175127.SAFE
- S2A_MSIL2A_20180508T104031_N0207_R008_T31UFS_20180508T175127.SAFE
- S2B_MSIL2A_20180506T105029_N0207_R051_T31UET_20180509T155709.SAFE
- S2B_MSIL2A_20180506T105029_N0207_R051_T31UFT_20180509T155709.SAFE
- S2B_MSIL2A_20180506T105029_N0207_R051_T31UFU_20180509T155709.SAFE
- S2B_MSIL2A_20180506T105029_N0207_R051_T31UES_20180509T155709.SAFE
- S2B_MSIL2A_20180506T105029_N0207_R051_T31UFV_20180509T155709.SAFE
- S2B_MSIL2A_20180506T105029_N0207_R051_T31UGT_20180509T155709.SAFE
- S2A_MSIL1C_20181113T093231_N0207_R136_T34TDL_20181113T100054.SAFE
References
1. Treitz, P. Remote sensing for mapping and monitoring land-cover and land-use change. Prog. Plan. 2004, 61, 267.
2. Drusch, M.; Bello, U.D.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36.
3. Lu, D.; Weng, Q. A Survey of Image Classification Methods and Techniques for Improving Classification Performance. Int. J. Remote Sens. 2007, 28, 823–870.
4. Ball, J.; Anderson, D.; Chan, C.S. A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community. J. Appl. Remote Sens. 2017, 11, 042609.
5. Camps-Valls, G.; Tuia, D.; Bruzzone, L.; Benediktsson, J.A. Advances in Hyperspectral Image Classification: Earth Monitoring with Statistical Learning Methods. IEEE Signal Process. Mag. 2014, 31, 45–54.
6. Noh, H.; Hong, S.; Han, B. Learning Deconvolution Network for Semantic Segmentation. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV ’15), Santiago, Chile, 7–13 December 2015; IEEE Computer Society: Washington, DC, USA, 2015; pp. 1520–1528.
7. Bischke, B.; Helber, P.; Folz, J.; Borth, D.; Dengel, A. Multi-Task Learning for Segmentation of Building Footprints with Deep Neural Networks. arXiv 2017, arXiv:1709.05932.
8. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36.
9. Circular No. A-16 Revised. Available online: https://obamawhitehouse.archives.gov/omb/circulars_a016_rev/#2 (accessed on 2 October 2019).
10. Craglia, M.; Annoni, A. INSPIRE: An innovative approach to the development of spatial data infrastructures in Europe. In Research and Theory in Advancing Spatial Data Infrastructure Concepts; ESRI Press: Redlands, CA, USA, 2007; pp. 93–105.
11. Williamson, I.; Rajabifard, A.; Binns, A. The role of Spatial Data Infrastructures in establishing an enabling platform for decision making in Australia. In Research and Theory in Advancing Spatial Data Infrastructure Concepts; ESRI Press: Redlands, CA, USA, 2007; pp. 121–132.
12. The Global Monitoring for Environment and Security (GMES) Programme. Available online: https://www.esa.int/About_Us/Ministerial_Council_2012/Global_Monitoring_for_Environment_and_Security_GMES (accessed on 10 February 2019).
13. Stoter, J.; van Smaalen, J.; Bakker, N.; Hardy, P. Specifying Map Requirements for Automated Generalization of Topographic Data. Cartogr. J. 2009, 46, 214–227.
14. The INSPIRE TOP10NL. Available online: https://www.pdok.nl/downloads?articleid=1976855 (accessed on 10 February 2019).
15. GDAL/OGR Contributors. GDAL/OGR Geospatial Data Abstraction Software Library; Open Source Geospatial Foundation. Available online: http://gdal.org (accessed on 10 February 2019).
16. Sentinel-2 Products Specification Document. Available online: https://sentinels.copernicus.eu/documents/247904/685211/Sentinel-2-Products-Specification-Document (accessed on 10 February 2019).
17. Huang, X.; Zhang, L. An SVM Ensemble Approach Combining Spectral, Structural, and Semantic Features for the Classification of High-Resolution Remotely Sensed Imagery. IEEE Trans. Geosci. Remote Sens. 2013, 51, 257–272.
18. Myint, S.W.; Gober, P.; Brazel, A.; Grossman-Clarke, S.; Weng, Q. Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery. Remote Sens. Environ. 2011, 115, 1145–1161.
19. Inglada, J. Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features. ISPRS J. Photogramm. Remote Sens. 2007, 62, 236–248.
20. Foody, G. Supervised image classification by MLP and RBF neural networks with and without an exhaustively defined set of classes. Int. J. Remote Sens. 2004, 25, 3091–3104.
21. Gislason, P.O.; Benediktsson, J.A.; Sveinsson, J.R. Random Forests for land cover classification. Pattern Recognit. Lett. 2006, 27, 294–300.
22. Eisavi, V.; Homayouni, S.; Yazdi, A.M.; Alimohammadi, A. Land cover mapping based on random forest classification of multitemporal spectral and thermal images. Environ. Monit. Assess. 2015, 187, 291.
23. Pelletier, C.; Valero, S.; Inglada, J.; Champion, N.; Dedieu, G. Assessing the robustness of Random Forests to map land cover with high resolution satellite image time series over large areas. Remote Sens. Environ. 2016, 187, 156–168.
24. Rodriguez-Galiano, V.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104.
25. Belgiu, M.; Csillik, O. Sentinel-2 cropland mapping using pixel-based and object-based time-weighted dynamic time warping analysis. Remote Sens. Environ. 2018, 204, 509–523.
26. Saini, R.; Ghosh, S.K. Crop Classification on Single Date Sentinel-2 Imagery Using Random Forest and Support Vector Machine. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2018, XLII-5, 683–688.
27. Puletti, N.; Chianucci, F.; Castaldi, C. Use of Sentinel-2 for forest classification in Mediterranean environments. Ann. Silvic. Res. 2018, 42, 32–38.
28. Phan, T.N.; Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 2018, 18, 18.
29. Forkuor, G.; Dimobe, K.; Serme, I.; Tondoh, J.E. Landsat-8 vs. Sentinel-2: Examining the added value of Sentinel-2’s red-edge bands to land-use and land-cover mapping in Burkina Faso. GISci. Remote Sens. 2018, 55, 331–354.
30. Pirotti, F.; Sunar, F.; Piragnolo, M. Benchmark of Machine Learning Methods for Classification of a Sentinel-2 Image. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, XLI-B7, 335–340.
31. Längkvist, M.; Kiselev, A.; Alirezaie, M.; Loutfi, A. Classification and Segmentation of Satellite Orthoimagery Using Convolutional Neural Networks. Remote Sens. 2016, 8, 329.
32. Pesaresi, M.; Corbane, C.; Julea, A.; Florczyk, A.J.; Syrris, V.; Soille, P. Assessment of the Added-Value of Sentinel-2 for Detecting Built-up Areas. Remote Sens. 2016, 8, 299.
33. Lillesand, T.M. Remote Sensing and Image Interpretation; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2008.
34. LeCun, Y.; Bengio, Y.; Hinton, G.E. Deep learning. Nature 2015, 521, 436–444.
35. LeCun, Y. Generalization and network design strategies. In Connectionism in Perspective; Pfeifer, R., Schreter, Z., Fogelman, F., Steels, L., Eds.; Elsevier: Amsterdam, The Netherlands, 1989.
36. Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall PTR: Upper Saddle River, NJ, USA, 1998.
37. LeCun, Y.A.; Bottou, L.; Orr, G.B.; Müller, K.R. Efficient BackProp. In Neural Networks: Tricks of the Trade, 2nd ed.; Montavon, G., Orr, G.B., Müller, K.R., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 9–48.
38. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
39. Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651.
40. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241.
41. Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P.A. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. J. Mach. Learn. Res. 2010, 11, 3371–3408.
42. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
43. Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning (ICML ’15), Lille, France, 6–11 July 2015; Volume 37, pp. 448–456.
44. Breiman, L. Bagging Predictors. Mach. Learn. 1996, 24, 123–140.
45. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
46. Gascon, F.; Bouzinac, C.; Thépaut, O.; Jung, M.; Francesconi, B.; Louis, J.; Lonjou, V.; Lafrance, B.; Massera, S.; Gaudel-Vacaresse, A.; et al. Copernicus Sentinel-2A Calibration and Products Validation Status. Remote Sens. 2017, 9, 584.
47. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
48. Witten, I.H.; Frank, E.; Hall, M.A. Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed.; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2011.
49. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; CRC Press: Boca Raton, FL, USA, 1984.
50. Soille, P.; Burger, A.; Marchi, D.D.; Kempeneers, P.; Rodriguez, D.; Syrris, V.; Vasilev, V. A versatile data-intensive computing platform for information retrieval from big geospatial data. Future Gener. Comput. Syst. 2018, 81, 30–40.
51. Sumbul, G.; Charfuelan, M.; Demir, B.; Markl, V. BigEarthNet: A Large-Scale Benchmark Archive for Remote Sensing Image Understanding. arXiv 2019, arXiv:1902.06148.
Class Names in Dutch | Class Names in English | Class Labels | (%) Pixels |
---|---|---|---|
NODATA | NODATA | 0 | 7.91% |
boomgaard, boomkwekerij, fruitkwekerij | low-density vegetation | 1 | 0.92% |
grasland | grassland | 2 | 28.66% |
akkerland, braakliggend | cropland | 3 | 15.51% |
bebouwd gebied | built-up | 4 | 0.11% |
bos:gemengd bos, bos:griend, bos:loofbos, bos:naaldbos | forest | 5 | 7.80% |
overig | other | 6 | 9.66% |
water | water | 7 | 27.27% |
aanlegsteiger, ‘basaltblokken, steenglooiing’, dodenakker, ‘dodenakker met bos’, duin, heide, populieren, spoorbaanlichaam, zand | remaining classes | 8 | 2.15% |
Band Number | Band Name | S2A Central Wavelength (nm) | S2A Bandwidth (nm) | S2B Central Wavelength (nm) | S2B Bandwidth (nm) | Resolution (m) |
---|---|---|---|---|---|---|
2 | Blue | 496.6 | 98 | 492.1 | 98 | 10 |
3 | Green | 560.0 | 45 | 559 | 46 | 10 |
4 | Red | 664.5 | 38 | 665 | 39 | 10 |
8 | NIR | 835.1 | 145 | 833 | 45 | 10 |
Model | Training Time (s) * | Prediction Time (s) | OA | WA | F1-Macro | F1-Weighted |
---|---|---|---|---|---|---|
RF-L-L-1x1 | 1061 ± 93 | 6.96 ± 0.19 | 0.75 ± 0.031 | 0.88 ± 0.025 | 0.44 ± 0.023 | 0.70 ± 0.012 |
RF-L-L-3x3 | 2170 ± 121 | 5.49 ± 0.11 | 0.81 ± 0.003 | 0.92 ± 0.006 | 0.57 ± 0.009 | 0.80 ± 0.007 |
RF-L-L-5x5 | 5418 ± 224 | 6.06 ± 0.06 | 0.79 ± 0.052 | 0.91 ± 0.026 | 0.53 ± 0.064 | 0.77 ± 0.051 |
CNN-L-L-5x5 | 1.88 ± 1.13 | 0.08 ± 0.03 | 0.84 ± 0.009 | 0.92 ± 0.012 | 0.69 ± 0.017 | 0.84 ± 0.010 |
CNN-L-L-15x15 | 5.66 ± 1.41 | 0.22 ± 0.04 | 0.84 ± 0.013 | 0.92 ± 0.011 | 0.70 ± 0.012 | 0.84 ± 0.013 |
FCN-L-L-244x244 | 328 ± 0.56 | 0.78 ± 0.19 | 0.72 ± 0.021 | 0.88 ± 0.019 | 0.52 ± 0.024 | 0.72 ± 0.016 |
SegNet-L-L-244x244 | 164 ± 0.35 | 1.21 ± 0.16 | 0.86 ± 0.027 | 0.95 ± 0.012 | 0.71 ± 0.011 | 0.88 ± 0.013 |
Unet-L-L-122x122 | 57 ± 23 | 0.23 ± 0.06 | 0.68 ± 0.013 | 0.88 ± 0.018 | 0.56 ± 0.021 | 0.78 ± 0.019 |
Unet-L-L-244x244 | 75 ± 20 | 0.46 ± 0.09 | 0.85 ± 0.015 | 0.94 ± 0.010 | 0.68 ± 0.026 | 0.85 ± 0.015 |
Unet-L-L-366x366 | 98 ± 5 | 1.61 ± 0.11 | 0.83 ± 0.022 | 0.93 ± 0.019 | 0.64 ± 0.020 | 0.82 ± 0.023 |
SegNet-L1C-L2A-244x244 | | | 0.82 ± 0.009 | 0.92 ± 0.008 | 0.62 ± 0.008 | 0.81 ± 0.010 |
Unet-L1Cmix-L2A-244x244 | | | 0.80 ± 0.012 | 0.92 ± 0.004 | 0.61 ± 0.005 | 0.80 ± 0.005 |
RF-L1Cmix-L1Cmix-5x5 | | | 0.71 ± 0.270 | 0.86 ± 0.013 | 0.44 ± 0.233 | 0.70 ± 0.020 |
CNN-L1Cmix-L1Cmix-5x5 | | | 0.87 ± 0.010 | 0.94 ± 0.003 | 0.73 ± 0.014 | 0.87 ± 0.003 |
Unet-L1Cmix-L1Cmix-244x244 | | | 0.85 ± 0.011 | 0.94 ± 0.005 | 0.69 ± 0.012 | 0.85 ± 0.003 |
SegNet-L1Cmix-L1Cmix-244x244 | | | 0.85 ± 0.009 | 0.94 ± 0.003 | 0.65 ± 0.019 | 0.85 ± 0.005 |
True Class | Model | Low-Density Vegetation | Grassland | Cropland | Built-Up | Forest | Other | Water | Remaining Classes |
---|---|---|---|---|---|---|---|---|---|
low-density vegetation | 1 | 0.24 | 0.37 | 0.25 | 0 | 0.04 | 0.10 | 0 | 0 |
| 2 | 0.19 | 0.31 | 0.23 | 0 | 0.11 | 0.16 | 0 | 0 |
| 3 | 0.19 | 0.32 | 0.36 | 0 | 0.03 | 0.10 | 0 | 0 |
| 4 | 0.11 | 0.28 | 0.31 | 0 | 0.09 | 0.19 | 0.01 | 0.01 |
| 5 | 0.11 | 0.39 | 0.37 | 0 | 0.03 | 0.10 | 0 | 0 |
| 6 | 0.21 | 0.38 | 0.24 | 0 | 0.06 | 0.11 | 0 | 0 |
| 7 | 0.25 | 0.37 | 0.28 | 0 | 0.03 | 0.07 | 0 | 0 |
grassland | 1 | 0 | 0.85 | 0.07 | 0 | 0.02 | 0.05 | 0.01 | 0 |
| 2 | 0 | 0.74 | 0.15 | 0 | 0.02 | 0.07 | 0.01 | 0.01 |
| 3 | 0 | 0.83 | 0.10 | 0 | 0.02 | 0.04 | 0.01 | 0 |
| 4 | 0 | 0.75 | 0.12 | 0 | 0.01 | 0.08 | 0.02 | 0.02 |
| 5 | 0 | 0.86 | 0.07 | 0 | 0.02 | 0.04 | 0.01 | 0 |
| 6 | 0 | 0.82 | 0.10 | 0 | 0.03 | 0.04 | 0.01 | 0 |
| 7 | 0 | 0.82 | 0.10 | 0 | 0.03 | 0.04 | 0.01 | 0 |
cropland | 1 | 0.01 | 0.18 | 0.80 | 0 | 0 | 0.01 | 0 | 0 |
| 2 | 0.01 | 0.16 | 0.79 | 0 | 0.01 | 0.03 | 0 | 0 |
| 3 | 0 | 0.13 | 0.86 | 0 | 0 | 0.01 | 0 | 0 |
| 4 | 0 | 0.18 | 0.77 | 0 | 0.01 | 0.03 | 0.01 | 0 |
| 5 | 0 | 0.10 | 0.89 | 0 | 0 | 0.01 | 0 | 0 |
| 6 | 0 | 0.21 | 0.77 | 0 | 0 | 0.02 | 0 | 0 |
| 7 | 0 | 0.27 | 0.70 | 0 | 0.01 | 0.02 | 0 | 0 |
built-up | 1 | 0 | 0 | 0 | 0 | 0 | 1.00 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0.98 | 0.02 | 0 |
| 3 | 0 | 0 | 0 | 0.19 | 0 | 0.81 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0.21 | 0 | 0.77 | 0.02 | 0 |
| 5 | 0 | 0 | 0 | 0.28 | 0 | 0.72 | 0 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 1.00 | 0 | 0 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0.99 | 0.01 | 0 |
forest | 1 | 0 | 0.11 | 0.01 | 0 | 0.81 | 0.05 | 0 | 0.02 |
| 2 | 0 | 0.11 | 0.02 | 0 | 0.76 | 0.07 | 0.01 | 0.03 |
| 3 | 0 | 0.12 | 0.01 | 0 | 0.81 | 0.04 | 0 | 0.02 |
| 4 | 0 | 0.14 | 0.02 | 0 | 0.58 | 0.10 | 0.01 | 0.15 |
| 5 | 0 | 0.12 | 0.01 | 0 | 0.82 | 0.04 | 0 | 0.01 |
| 6 | 0 | 0.10 | 0.01 | 0 | 0.82 | 0.05 | 0 | 0.02 |
| 7 | 0 | 0.20 | 0.03 | 0 | 0.70 | 0.04 | 0.01 | 0.02 |
other | 1 | 0 | 0.09 | 0.02 | 0 | 0.02 | 0.86 | 0.01 | 0 |
| 2 | 0 | 0.07 | 0.01 | 0 | 0.02 | 0.89 | 0.01 | 0 |
| 3 | 0 | 0.10 | 0.03 | 0 | 0.02 | 0.84 | 0.01 | 0 |
| 4 | 0 | 0.06 | 0.01 | 0 | 0.01 | 0.89 | 0.02 | 0.01 |
| 5 | 0 | 0.10 | 0.03 | 0 | 0.01 | 0.85 | 0.01 | 0 |
| 6 | 0 | 0.09 | 0.02 | 0 | 0.02 | 0.86 | 0.01 | 0 |
| 7 | 0 | 0.25 | 0.06 | 0 | 0.03 | 0.65 | 0.01 | 0 |
water | 1 | 0 | 0.02 | 0 | 0 | 0 | 0.02 | 0.96 | 0 |
| 2 | 0 | 0.02 | 0.01 | 0 | 0 | 0.02 | 0.95 | 0 |
| 3 | 0 | 0.02 | 0 | 0 | 0 | 0.02 | 0.96 | 0 |
| 4 | 0 | 0.01 | 0.01 | 0 | 0 | 0.02 | 0.95 | 0.01 |
| 5 | 0 | 0.02 | 0 | 0 | 0 | 0.01 | 0.97 | 0 |
| 6 | 0 | 0.02 | 0 | 0 | 0 | 0.02 | 0.96 | 0 |
| 7 | 0 | 0.04 | 0.01 | 0 | 0 | 0.01 | 0.94 | 0 |
remaining classes | 1 | 0 | 0.11 | 0.02 | 0 | 0.13 | 0.07 | 0.04 | 0.63 |
| 2 | 0 | 0.07 | 0.04 | 0 | 0.14 | 0.11 | 0.04 | 0.60 |
| 3 | 0 | 0.10 | 0.02 | 0 | 0.11 | 0.07 | 0.04 | 0.66 |
| 4 | 0 | 0.06 | 0.03 | 0 | 0.09 | 0.10 | 0.04 | 0.68 |
| 5 | 0 | 0.11 | 0.02 | 0 | 0.11 | 0.07 | 0.03 | 0.66 |
| 6 | 0 | 0.10 | 0.02 | 0 | 0.15 | 0.08 | 0.05 | 0.60 |
| 7 | 0 | 0.10 | 0.02 | 0 | 0.15 | 0.04 | 0.06 | 0.63 |
True Class | Model | Low-Density Vegetation | Grassland | Cropland | Built-Up | Forest | Other | Water | Remaining Classes |
---|---|---|---|---|---|---|---|---|---|
low-density vegetation | 8 | 0.09 | 0.38 | 0.37 | 0 | 0.15 | 0.01 | 0 | 0 |
| 9 | 0.01 | 0.45 | 0.29 | 0 | 0.17 | 0.08 | 0 | 0 |
| 10 | 0.25 | 0.40 | 0.14 | 0 | 0.09 | 0.09 | 0.01 | 0.02 |
| 11 | 0.13 | 0.48 | 0.13 | 0 | 0.16 | 0.09 | 0 | 0.01 |
| 12 | 0.08 | 0.48 | 0.12 | 0 | 0.21 | 0.09 | 0 | 0.02 |
| 13 | 0.02 | 0.47 | 0.12 | 0 | 0.25 | 0.11 | 0.01 | 0.02 |
| 14 | 0.11 | 0.50 | 0.13 | 0 | 0.15 | 0.10 | 0 | 0.01 |
grassland | 8 | 0 | 0.79 | 0.11 | 0 | 0.03 | 0.05 | 0.02 | 0 |
| 9 | 0 | 0.81 | 0.10 | 0 | 0.03 | 0.05 | 0.01 | 0 |
| 10 | 0 | 0.85 | 0.06 | 0 | 0.03 | 0.04 | 0.01 | 0.01 |
| 11 | 0 | 0.86 | 0.07 | 0 | 0.03 | 0.04 | 0 | 0 |
| 12 | 0 | 0.86 | 0.07 | 0 | 0.03 | 0.04 | 0 | 0 |
| 13 | 0 | 0.81 | 0.09 | 0 | 0.05 | 0.04 | 0 | 0.01 |
| 14 | 0 | 0.84 | 0.07 | 0 | 0.03 | 0.05 | 0.01 | 0 |
cropland | 8 | 0 | 0.27 | 0.68 | 0 | 0.02 | 0.02 | 0.01 | 0 |
| 9 | 0 | 0.20 | 0.77 | 0 | 0.01 | 0.02 | 0 | 0 |
| 10 | 0.01 | 0.27 | 0.69 | 0 | 0.01 | 0.01 | 0.01 | 0 |
| 11 | 0 | 0.28 | 0.68 | 0 | 0.01 | 0.03 | 0 | 0 |
| 12 | 0 | 0.29 | 0.66 | 0 | 0.01 | 0.03 | 0.01 | 0 |
| 13 | 0 | 0.33 | 0.53 | 0 | 0.02 | 0.10 | 0.01 | 0.01 |
| 14 | 0 | 0.29 | 0.67 | 0 | 0.01 | 0.03 | 0 | 0 |
built-up | 8 | 0 | 0 | 0 | 0 | 0 | 0.98 | 0.02 | 0 |
| 9 | 0 | 0 | 0 | 0.14 | 0 | 0.86 | 0 | 0 |
| 10 | 0 | 0 | 0 | 0.13 | 0 | 0.87 | 0 | 0 |
| 11 | 0 | 0 | 0.01 | 0 | 0 | 0.96 | 0.03 | 0 |
| 12 | 0 | 0 | 0.02 | 0 | 0 | 0.96 | 0.02 | 0 |
| 13 | 0 | 0.04 | 0.31 | 0 | 0.03 | 0.45 | 0.16 | 0.01 |
| 14 | 0 | 0 | 0.01 | 0 | 0 | 0.97 | 0.02 | 0 |
forest | 8 | 0 | 0.23 | 0.06 | 0 | 0.63 | 0.04 | 0.01 | 0.03 |
| 9 | 0 | 0.12 | 0.01 | 0 | 0.80 | 0.05 | 0 | 0.02 |
| 10 | 0 | 0.10 | 0.01 | 0 | 0.83 | 0.04 | 0 | 0.02 |
| 11 | 0 | 0.13 | 0 | 0 | 0.82 | 0.04 | 0 | 0.01 |
| 12 | 0 | 0.13 | 0 | 0 | 0.82 | 0.04 | 0 | 0.01 |
| 13 | 0 | 0.17 | 0.02 | 0 | 0.73 | 0.06 | 0 | 0.02 |
| 14 | 0 | 0.15 | 0.01 | 0 | 0.79 | 0.04 | 0 | 0.01 |
other | 8 | 0 | 0.29 | 0.09 | 0 | 0.04 | 0.56 | 0.02 | 0 |
| 9 | 0 | 0.12 | 0.02 | 0 | 0.03 | 0.82 | 0.01 | 0 |
| 10 | 0 | 0.12 | 0.02 | 0 | 0.03 | 0.81 | 0.01 | 0.01 |
| 11 | 0 | 0.14 | 0.03 | 0 | 0.04 | 0.78 | 0.01 | 0 |
| 12 | 0 | 0.15 | 0.02 | 0 | 0.04 | 0.78 | 0.01 | 0 |
| 13 | 0 | 0.18 | 0.22 | 0 | 0.10 | 0.41 | 0.07 | 0.02 |
| 14 | 0 | 0.16 | 0.03 | 0 | 0.04 | 0.76 | 0.01 | 0 |
water | 8 | 0 | 0.05 | 0.01 | 0 | 0 | 0.01 | 0.93 | 0 |
| 9 | 0 | 0.02 | 0.01 | 0 | 0 | 0.02 | 0.95 | 0 |
| 10 | 0 | 0.02 | 0 | 0 | 0 | 0.02 | 0.96 | 0 |
| 11 | 0 | 0.02 | 0.01 | 0 | 0 | 0.02 | 0.95 | 0 |
| 12 | 0 | 0.02 | 0.01 | 0 | 0 | 0.01 | 0.96 | 0 |
| 13 | 0 | 0.02 | 0.02 | 0 | 0.01 | 0.02 | 0.93 | 0 |
| 14 | 0 | 0.02 | 0.01 | 0 | 0 | 0.02 | 0.95 | 0 |
remaining classes | 8 | 0 | 0.14 | 0.03 | 0 | 0.18 | 0.05 | 0.09 | 0.51 |
| 9 | 0 | 0.11 | 0.02 | 0 | 0.18 | 0.07 | 0.04 | 0.58 |
| 10 | 0 | 0.11 | 0.03 | 0 | 0.15 | 0.06 | 0.04 | 0.61 |
| 11 | 0 | 0.19 | 0.07 | 0 | 0.17 | 0.18 | 0.04 | 0.35 |
| 12 | 0 | 0.23 | 0.07 | 0 | 0.19 | 0.16 | 0.04 | 0.31 |
| 13 | 0 | 0.25 | 0.14 | 0 | 0.20 | 0.16 | 0.04 | 0.21 |
| 14 | 0 | 0.21 | 0.14 | 0 | 0.16 | 0.18 | 0.04 | 0.27 |
TOP10NL-Based Classes | Colour | CORINE 2018 Classes |
---|---|---|
1. low-density vegetation | dark blue | 141, 333 |
2. grassland | blue | 321, 322, 323, 324 |
3. cropland | light blue | 211, 212, 213, 221, 222, 223, 231, 241, 242, 243 |
4. built-up | blue-green | 111, 112, 121, 122, 123, 124, 142 |
5. forest | green | 244, 311, 312, 313 |
6. other | orange | 131, 132, 133 |
7. water | red | 511, 512, 521, 522, 523 |
8. remaining classes | brown | 331, 332, 334, 335, 411, 412, 421, 422, 423 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).