Landslide susceptibility prediction and identification of its main environmental factors based on machine learning models
-
Abstract: Machine learning models differ in their modelling processes and uncertainties when used for landslide susceptibility prediction (LSP), and effectively identifying the main control factors of landslide susceptibility is of great significance. To address these issues, this study takes the support vector machine (SVM) and random forest (RF) as examples to discuss machine-learning-based LSP and its uncertainties, and proposes a "weighted mean method" to calculate the main landslide control factors more accurately. First, the landslide inventory and 10 basic environmental factors of Yanchang County, Shaanxi Province, are obtained, and the frequency ratios (FRs) of the environmental factors are taken as the input variables of the SVM and RF models. Then, the landslide samples and randomly selected non-landslide samples are divided into training and testing datasets, and the trained SVM and RF models are used to predict landslide susceptibility and produce the susceptibility maps. Finally, the modelling uncertainties are evaluated with the receiver operating characteristic (ROC) curve, the mean value and the standard deviation, and the main landslide control factors are calculated. The results show that: ① machine learning models can effectively predict regional landslide susceptibility; RF achieves higher LSP accuracy and lower uncertainty than SVM, although the susceptibility distribution patterns of the two models are similar overall; ② the main control factors of landslides in Yanchang County calculated by the weighted mean method are, in order, slope, elevation and lithology; ③ the case study and literature review indicate that RF is a relatively reliable susceptibility model compared with other machine learning models.
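The ROC-based evaluation mentioned in the abstract can be illustrated with a minimal sketch. It computes the area under the ROC curve (AUC) through its rank interpretation and summarizes each model's predictions by mean and standard deviation. The labels and susceptibility indexes below are hypothetical values made up for illustration, not results from the study, and applying the mean and standard deviation to the predicted indexes is an assumption about how those statistics are used.

```python
from statistics import mean, pstdev


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen landslide sample
    receives a higher score than a randomly chosen non-landslide sample
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test samples: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical susceptibility indexes from two models (names are placeholders).
model_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
model_b = [0.7, 0.6, 0.5, 0.6, 0.4, 0.3]

for name, scores in (("model_a", model_a), ("model_b", model_b)):
    print(name,
          "AUC:", round(roc_auc(labels, scores), 3),
          "mean:", round(mean(scores), 3),
          "std:", round(pstdev(scores), 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; for region-scale datasets a library routine such as scikit-learn's `roc_auc_score` would be the more practical choice.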
-
Table 1. Attribute intervals and frequency ratios of each basic environmental factor

| Environmental factor | Class | Total grid cells | Proportion of cells/% | Landslide grid cells | Proportion of landslide cells/% | Frequency ratio |
|---|---|---|---|---|---|---|
| Aspect/(°) (continuous) | -1 | 190 | 0.007 | 0 | 0.000 | 0.000 |
| | [0, 22.5) | 222 188 | 8.472 | 211 | 6.200 | 0.732 |
| | [22.5, 67.5) | 304 138 | 11.597 | 576 | 16.926 | 1.459 |
| | [67.5, 112.5) | 484 681 | 18.482 | 481 | 14.135 | 0.765 |
| | [112.5, 157.5) | 345 859 | 13.188 | 636 | 18.689 | 1.417 |
| | [157.5, 202.5) | 254 226 | 9.694 | 430 | 12.636 | 1.303 |
| | [202.5, 247.5) | 333 472 | 12.716 | 451 | 13.253 | 1.042 |
| | [247.5, 292.5) | 382 265 | 14.576 | 241 | 7.082 | 0.486 |
| | [292.5, 337.5) | 295 463 | 11.267 | 377 | 11.078 | 0.983 |
| | [337.5, 360] | 222 188 | 8.472 | 211 | 6.200 | 0.732 |
| Slope/(°) (continuous) | [0, 6.10) | 262 663 | 10.016 | 38 | 1.117 | 0.111 |
| | [6.10, 10.45) | 420 175 | 16.022 | 148 | 4.349 | 0.271 |
| | [10.45, 14.20) | 485 296 | 18.505 | 369 | 10.843 | 0.586 |
| | [14.20, 17.55) | 477 773 | 18.218 | 638 | 18.748 | 1.029 |
| | [17.55, 20.70) | 417 670 | 15.927 | 803 | 23.597 | 1.482 |
| | [20.70, 23.86) | 309 745 | 11.811 | 741 | 21.775 | 1.844 |
| | [23.86, 27.61) | 186 825 | 7.124 | 540 | 15.868 | 2.227 |
| | [27.61, 50.29] | 62 335 | 2.377 | 126 | 3.703 | 1.558 |
| Plan curvature (continuous) | [0, 9.91) | 531 824 | 20.279 | 1 028 | 30.209 | 1.490 |
| | [9.91, 18.21) | 536 386 | 20.453 | 938 | 27.564 | 1.348 |
| | [18.21, 27.48) | 415 938 | 15.860 | 563 | 16.544 | 1.043 |
| | [27.48, 37.39) | 298 275 | 11.374 | 338 | 9.932 | 0.873 |
| | [37.39, 47.93) | 229 817 | 8.763 | 194 | 5.701 | 0.651 |
| | [47.93, 59.12) | 188 588 | 7.191 | 154 | 4.525 | 0.629 |
| | [59.12, 70.62) | 160 843 | 6.133 | 78 | 2.292 | 0.374 |
| | [70.62, 81.49] | 260 811 | 9.945 | 110 | 3.232 | 0.325 |
| Profile curvature (continuous) | [0, 2.46) | 518 211 | 19.760 | 691 | 20.306 | 1.028 |
| | [2.46, 4.34) | 614 610 | 23.436 | 839 | 24.655 | 1.052 |
| | [4.34, 6.33) | 531 797 | 20.278 | 736 | 21.628 | 1.067 |
| | [6.33, 8.33) | 377 404 | 14.391 | 492 | 14.458 | 1.005 |
| | [8.33, 10.44) | 264 686 | 10.093 | 294 | 8.639 | 0.856 |
| | [10.44, 12.90) | 181 130 | 6.907 | 223 | 6.553 | 0.949 |
| | [12.90, 15.95) | 99 956 | 3.812 | 94 | 2.762 | 0.725 |
| | [15.95, 29.90] | 34 688 | 1.323 | 34 | 0.999 | 0.755 |
| Elevation/m (continuous) | [473.14, 656.00) | 65 566 | 2.500 | 0 | 0.000 | 0.000 |
| | [656.00, 772.04) | 162 922 | 6.213 | 0 | 0.000 | 0.000 |
| | [772.04, 866.99) | 282 304 | 10.765 | 336 | 9.874 | 0.917 |
| | [866.99, 944.35) | 422 849 | 16.124 | 1 078 | 31.678 | 1.965 |
| | [944.35, 1 014.68) | 552 133 | 21.054 | 1 068 | 31.384 | 1.491 |
| | [1 014.68, 1 085.01) | 551 281 | 21.021 | 605 | 17.778 | 0.846 |
| | [1 085.01, 1 165.89) | 395 921 | 15.097 | 207 | 6.083 | 0.403 |
| | [1 165.89, 1 369.84] | 189 506 | 7.226 | 109 | 3.203 | 0.443 |
| NDVI (continuous) | [0.054, 0.161) | 4 323 | 0.165 | 0 | 0.000 | 0.000 |
| | [0.161, 0.222) | 222 528 | 8.485 | 346 | 10.167 | 1.198 |
| | [0.222, 0.248) | 347 566 | 13.253 | 503 | 14.781 | 1.115 |
| | [0.248, 0.271) | 522 175 | 19.911 | 590 | 17.338 | 0.871 |
| | [0.271, 0.290) | 651 814 | 24.855 | 837 | 24.596 | 0.990 |
| | [0.290, 0.316) | 726 302 | 27.695 | 958 | 28.152 | 1.016 |
| | [0.316, 0.514) | 147 748 | 5.634 | 169 | 4.966 | 0.881 |
| | [0.514, 0.880] | 26 | 0.001 | 0 | 0.000 | 0.000 |
| NDBI (continuous) | [0.015, 0.032) | 166 | 0.006 | 0 | 0.000 | 0.000 |
| | [0.032, 0.523) | 20 500 | 0.782 | 3 | 0.088 | 0.113 |
| | [0.523, 0.550) | 144 671 | 5.517 | 263 | 7.728 | 1.401 |
| | [0.550, 0.569) | 328 825 | 12.539 | 477 | 14.017 | 1.118 |
| | [0.569, 0.585) | 461 789 | 17.609 | 517 | 15.192 | 0.863 |
| | [0.585, 0.601) | 638 451 | 24.345 | 610 | 17.925 | 0.736 |
| | [0.601, 0.617) | 636 591 | 24.274 | 816 | 23.979 | 0.988 |
| | [0.617, 0.701] | 391 489 | 14.928 | 717 | 21.070 | 1.411 |
| MNDWI (continuous) | [0.192, 0.328) | 414 411 | 15.802 | 749 | 22.010 | 1.393 |
| | [0.328, 0.356) | 732 165 | 27.919 | 866 | 25.448 | 0.912 |
| | [0.356, 0.384) | 565 063 | 21.547 | 559 | 16.427 | 0.762 |
| | [0.384, 0.418) | 421 031 | 16.055 | 479 | 14.076 | 0.877 |
| | [0.418, 0.455) | 292 549 | 11.155 | 421 | 12.371 | 1.109 |
| | [0.455, 0.513) | 192 388 | 7.336 | 329 | 9.668 | 1.318 |
| | [0.513, 0.640) | 4 737 | 0.181 | 0 | 0.000 | 0.000 |
| | [0.640, 0.981] | 138 | 0.005 | 0 | 0.000 | 0.000 |
| Total radiation (continuous) | [0, 90) | 6 815 | 0.260 | 0 | 0.000 | 0.000 |
| | [90, 170) | 68 498 | 2.612 | 98 | 2.880 | 1.103 |
| | [170, 185) | 187 607 | 7.154 | 456 | 13.400 | 1.873 |
| | [185, 198) | 276 782 | 10.554 | 448 | 13.165 | 1.247 |
| | [198, 211) | 339 667 | 12.952 | 507 | 14.899 | 1.150 |
| | [211, 225) | 424 230 | 16.177 | 589 | 17.308 | 1.070 |
| | [225, 239) | 537 676 | 20.503 | 567 | 16.662 | 0.813 |
| | [239, 255] | 781 207 | 29.789 | 738 | 21.687 | 0.728 |
| Lithology (discrete) | Mudstone and oil shale | 179 598 | 6.848 | 0 | 0.000 | 0.000 |
| | Mudstone only | 140 332 | 5.351 | 364 | 10.696 | 1.999 |
| | Sandstone and mudstone | 134 435 | 5.126 | 425 | 12.489 | 2.436 |
| | Quartz sandstone | 2 231 | 0.085 | 0 | 0.000 | 0.000 |
| | Aeolian and proluvial loess | 2 165 886 | 82.589 | 2 614 | 76.815 | 0.930 |
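Table 2. Frequency ratio accuracy analysis of the susceptibility maps of the RF and SVM models

| Model | Susceptibility level | Total grid cells | Proportion of cells/% | Landslide grid cells | Proportion of landslide cells/% | Frequency ratio (FR) |
|---|---|---|---|---|---|---|
| RF | Very low | 695 561 | 26.52 | 9 | 0.26 | 0.010 |
| | Low | 641 183 | 24.45 | 39 | 1.15 | 0.047 |
| | Moderate | 575 729 | 21.95 | 153 | 4.50 | 0.205 |
| | High | 428 719 | 16.35 | 478 | 14.05 | 0.859 |
| | Very high | 281 290 | 10.73 | 2 724 | 80.05 | 7.463 |
| SVM | Very low | 773 173 | 29.48 | 84 | 2.47 | 0.084 |
| | Low | 699 525 | 26.67 | 346 | 10.17 | 0.381 |
| | Moderate | 388 165 | 14.80 | 357 | 10.49 | 0.709 |
| | High | 337 687 | 12.88 | 568 | 16.69 | 1.296 |
| | Very high | 423 932 | 16.17 | 2 048 | 60.18 | 3.723 |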
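The frequency ratios reported in Tables 1 and 2 appear to follow the usual definition: the share of landslide cells falling in a class divided by the share of all cells falling in that class. The sketch below recomputes them from the cell counts listed for the slope factor (Table 1) and for the RF susceptibility levels (Table 2). The function name `frequency_ratio` is a hypothetical helper introduced here, and treating an empty class as FR = 0 is an assumption.

```python
def frequency_ratio(class_cells, class_landslide_cells):
    """Frequency ratio per class:
    (landslide cells in class / all landslide cells)
    divided by (cells in class / all cells)."""
    total_cells = sum(class_cells)
    total_landslides = sum(class_landslide_cells)
    ratios = []
    for cells, landslides in zip(class_cells, class_landslide_cells):
        if cells == 0:
            ratios.append(0.0)  # assumption: an empty class gets FR = 0
        else:
            ratios.append((landslides / total_landslides) / (cells / total_cells))
    return ratios


# Slope classes from Table 1, lowest to highest interval.
slope_cells = [262663, 420175, 485296, 477773, 417670, 309745, 186825, 62335]
slope_landslides = [38, 148, 369, 638, 803, 741, 540, 126]
slope_fr = frequency_ratio(slope_cells, slope_landslides)

# RF susceptibility levels from Table 2, very low to very high.
rf_cells = [695561, 641183, 575729, 428719, 281290]
rf_landslides = [9, 39, 153, 478, 2724]
rf_fr = frequency_ratio(rf_cells, rf_landslides)

print([round(v, 3) for v in slope_fr])
print([round(v, 3) for v in rf_fr])
```

With these counts, the [23.86, 27.61) slope class comes out near the 2.227 listed in Table 1, and the very high RF level near the 7.463 listed in Table 2. An FR above 1 means landslides are over-represented in a class relative to its area.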