Debris flow susceptibility evaluation of Liangshan Prefecture based on the RSIV-RF model
-
Abstract:
Objective When the random forest (RF) model is used for debris flow susceptibility assessment, continuous factors are often classified subjectively, and randomly selected non-debris-flow samples have low accuracy. Taking Liangshan Yi Autonomous Prefecture in southwestern Sichuan Province as the study area, this paper proposes a random forest based on sampling from a statistical prior model to evaluate and zone debris flow susceptibility.
Methods Continuous factors were classified according to the relative changes in curves such as the cumulative disaster frequency. Rough set theory (RS) and the information value method (IV) were used to calculate weighted information values, delimit the extremely low and low susceptibility zones, and select the negative samples from these zones. The optimal number of trees (n_estimators) and the number of features considered at each split (max_features) for the RF model were determined from the out-of-bag (OOB) error curves. A weighted information value-random forest (RSIV-RF) model was then constructed to predict debris flow susceptibility in Liangshan Prefecture. Finally, the RSIV-RF model was compared with an RF model whose non-debris-flow samples were selected randomly from the entire study area.
Results The accuracy of the RSIV-RF model on the training and test sets is 0.89 and 0.83, respectively, and the corresponding AUC values of the ROC curves are 0.920 and 0.895, all higher than those of the standalone RF model. The debris flow susceptibility map produced by RSIV-RF is largely consistent with the distribution of historical disasters: the relatively high and high susceptibility zones account for 18.625% of the study area and contain 78.57% of the debris flow points.
Conclusion The performance evaluation and the susceptibility statistics both show that RSIV-RF addresses the inaccurate sampling of non-debris-flow samples in the standalone RF model and predicts debris flow susceptibility more accurately. The model is well suited to debris flow susceptibility evaluation in Liangshan Prefecture.
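
The negative-sample selection step described in the Methods can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the per-cell scores, the zone threshold `LOW_ZONE_MAX`, and the helper name `select_negative_samples` are all hypothetical, since the abstract does not state how the zone boundaries were chosen.

```python
import random

# Hypothetical upper bound on the total weighted information value for a cell
# to count as extremely low or low susceptibility (not taken from the paper).
LOW_ZONE_MAX = -0.2


def select_negative_samples(cell_scores, positive_cells, n_samples, seed=0):
    """Draw non-debris-flow cells from the low-susceptibility zones only.

    cell_scores    -- dict mapping cell id to total weighted information value
    positive_cells -- set of cell ids that contain recorded debris flows
    n_samples      -- number of negative samples to draw
    """
    candidates = sorted(
        cell for cell, score in cell_scores.items()
        if score <= LOW_ZONE_MAX and cell not in positive_cells
    )
    if n_samples > len(candidates):
        raise ValueError("not enough low-susceptibility cells to sample from")
    # A seeded generator keeps the draw reproducible.
    return random.Random(seed).sample(candidates, n_samples)


# Toy data: ten cells with made-up scores, two of which hold debris flows.
scores = {
    "c0": -0.95, "c1": -0.60, "c2": -0.35, "c3": -0.25, "c4": -0.10,
    "c5": 0.05, "c6": 0.30, "c7": 0.55, "c8": 0.80, "c9": -0.40,
}
positives = {"c7", "c8"}
negatives = select_negative_samples(scores, positives, n_samples=len(positives))
print(negatives)
```

Drawing as many negatives as positives keeps the toy training set balanced; whether the study used a 1:1 ratio is not stated in the abstract.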
-
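Table 1. Calculation of the index correlation coefficient

| Index factor | Rainfall | Slope | Distance to road | Distance to river | Aspect | Elevation | Land use | NDVI |
|---|---|---|---|---|---|---|---|---|
| Rainfall | 1.000 | | | | | | | |
| Slope | 0.101 | 1.000 | | | | | | |
| Distance to road | -0.110 | -0.020 | 1.000 | | | | | |
| Distance to river | 0.003 | -0.011 | 0.340 | 1.000 | | | | |
| Aspect | -0.123 | -0.113 | 0.098 | -0.125 | 1.000 | | | |
| Elevation | -0.486 | -0.102 | 0.447 | 0.301 | 0.113 | 1.000 | | |
| Land use | -0.057 | 0.037 | 0.140 | -0.062 | 0.115 | 0.119 | 1.000 | |
| NDVI | 0.124 | 0.304 | 0.058 | 0.132 | -0.009 | -0.044 | -0.034 | 1.000 |
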
Table 2. Calculation of the index factor RS weight coefficient
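
| Index factor | Attribute importance | Weight coefficient | Weight rank |
|---|---|---|---|
| Rainfall | 0.0655 | 0.1048 | 6 |
| Slope | 0.0933 | 0.1492 | 3 |
| Distance to road | 0.0556 | 0.0889 | 7 |
| Distance to river | 0.0496 | 0.0794 | 8 |
| Aspect | 0.1171 | 0.1873 | 1 |
| Elevation | 0.1032 | 0.1651 | 2 |
| Land use | 0.0714 | 0.1143 | 4 |
| NDVI | 0.0694 | 0.1111 | 5 |
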
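
The weight coefficients in Table 2 appear to be the attribute importances normalized to sum to 1. That reading is inferred from the numbers rather than stated in the text, so the sketch below is only a consistency check under that assumption; the dictionary keys are illustrative names.

```python
# Attribute importance per factor, copied from Table 2.
importance = {
    "rainfall": 0.0655, "slope": 0.0933, "distance_to_road": 0.0556,
    "distance_to_river": 0.0496, "aspect": 0.1171, "elevation": 0.1032,
    "land_use": 0.0714, "ndvi": 0.0694,
}

# Assumed rule: weight = importance / sum of all importances.
total = sum(importance.values())
weights = {name: value / total for name, value in importance.items()}

# Rank factors from the largest weight to the smallest.
ranking = sorted(weights, key=weights.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {weights[name]:.4f}")
```

Under this assumption the computed weights agree with the tabulated ones to within about 0.0001, which is plausible if the published importances were rounded before printing, and the ranking matches the last column of the table.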
Table 3. Calculation results of weighted information content of index factors

| Index factor | Class | Information value | Weighted information value |
|---|---|---|---|
| Rainfall (mm) | 900 | -1.4925 | -0.1564 |
| | 950 | -2.6444 | -0.2770 |
| | 1050 | -0.1366 | -0.0143 |
| | 1150 | 0.1791 | 0.0188 |
| | >1150 | 1.5493 | 0.1623 |
| Slope (°) | [0, 15) | -0.5024 | -0.0750 |
| | [15, 20) | 0.1961 | 0.0293 |
| | [20, 25) | 0.4317 | 0.0644 |
| | [25, 45) | -0.2726 | -0.0407 |
| | [45, 60] | -0.7791 | -0.1162 |
| Distance to road (m) | [0, 300) | 2.1267 | 0.1890 |
| | [300, 700) | 0.8686 | 0.0772 |
| | [700, 1000) | -0.1145 | -0.0102 |
| | [1000, 1500) | 0.7284 | 0.0647 |
| | [1500, 2000] | -0.3448 | -0.0306 |
| | >2000 | -0.4619 | -0.0411 |
| Distance to river (m) | [0, 200) | 1.6317 | 0.1295 |
| | [200, 300) | 2.0127 | 0.1597 |
| | [300, 600) | 1.4589 | 0.1158 |
| | [600, 1000) | 1.3606 | 0.1080 |
| | [1000, 1500) | 0.6411 | 0.0509 |
| | [1500, 2000] | 0.0815 | 0.0065 |
| | >2000 | -0.5395 | -0.0428 |
| Aspect | -1 | 0.0000 | 0.0000 |
| | N-NE (0° to 67.5°) | -0.3798 | -0.0711 |
| | E-S (67.5° to 202.5°) | 0.0993 | 0.0186 |
| | SW-W (202.5° to 292.5°) | 0.1610 | 0.0301 |
| | NW-N (292.5° to 360°) | -0.1547 | -0.0290 |
| Normalized difference vegetation index (NDVI) | 0.1 | 1.1460 | 0.1273 |
| | 0.2 | 1.0240 | 0.1138 |
| | 0.4 | 0.5648 | 0.0628 |
| | 0.6 | 0.3023 | 0.0336 |
| | >0.6 | -1.1373 | -0.1264 |
| Land use | Cultivated land | 0.9030 | 0.1032 |
| | Forest land | -0.6960 | -0.0795 |
| | Grassland | 0.0385 | 0.0044 |
| | Water body | 0.5529 | 0.0632 |
| | Construction land | 1.3908 | 0.1590 |
| | Unused land | 0.0000 | 0.0000 |
| Elevation (m) | 1220 | 2.0909 | 0.3452 |
| | 1630 | 0.9930 | 0.1639 |
| | 2000 | 0.7980 | 0.1317 |
| | 2600 | -0.1177 | -0.0194 |
| | 3000 | -1.3212 | -0.2181 |
| | >3000 | -2.6089 | -0.4307 |

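
The last column of Table 3 is consistent with multiplying each class's information value by the RS weight of its factor from Table 2. The sketch below checks that relation for three of the eight factors and then sums the weighted values for two hypothetical grid cells. The function names and the two example cells are assumptions made for illustration; summing over factors is the usual way a weighted information value is aggregated per cell, but the abstract does not spell it out.

```python
# RS weight coefficients for three factors, copied from Table 2.
WEIGHTS = {"rainfall": 0.1048, "slope": 0.1492, "elevation": 0.1651}

# Information values for a few classes, copied from Table 3.
INFO_VALUE = {
    "rainfall": {">1150": 1.5493, "900": -1.4925},
    "slope": {"[20, 25)": 0.4317, "[45, 60]": -0.7791},
    "elevation": {"1220": 2.0909, ">3000": -2.6089},
}


def weighted_info(factor, cls):
    """Weighted information value of one class: RS weight times information value."""
    return WEIGHTS[factor] * INFO_VALUE[factor][cls]


def cell_score(cell_classes):
    """Sum the weighted information values over the factors of one grid cell."""
    return sum(weighted_info(factor, cls) for factor, cls in cell_classes.items())


# Two hypothetical cells, described by the class each factor falls into.
valley_cell = {"rainfall": ">1150", "slope": "[20, 25)", "elevation": "1220"}
summit_cell = {"rainfall": "900", "slope": "[45, 60]", "elevation": ">3000"}

print(f"valley cell: {cell_score(valley_cell):.4f}")
print(f"summit cell: {cell_score(summit_cell):.4f}")
```

A cell with a strongly negative total, like the second one here, is the kind of location that would fall into the extremely low or low susceptibility zones from which negative samples are drawn.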
Table 4. Statistics of susceptibility zoning results of the RSIV-RF model
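
| Susceptibility zone | Number of raster cells | Share of raster cells (%) | Number of debris flows | Share of debris flows (%) | Disaster density |
|---|---|---|---|---|---|
| Extremely low | 36,061,783 | 54.316 | 6 | 2.381 | 0.044 |
| Low | 8,153,595 | 12.281 | 16 | 6.349 | 0.517 |
| Moderate | 10,288,057 | 15.496 | 32 | 12.698 | 0.819 |
| Relatively high | 8,104,859 | 12.208 | 87 | 34.524 | 2.828 |
| High | 4,021,161 | 6.057 | 111 | 44.048 | 7.273 |
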
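
The disaster density column in Table 4 matches the share of debris flows in a zone divided by the zone's share of raster cells. This definition is inferred from the tabulated values, so the following sketch should be read as a check under that assumption rather than as the authors' procedure.

```python
# (zone, share of raster cells in %, number of debris flows), copied from Table 4.
ZONES = [
    ("extremely low", 54.316, 6),
    ("low", 12.281, 16),
    ("moderate", 15.496, 32),
    ("relatively high", 12.208, 87),
    ("high", 6.057, 111),
]

total_flows = sum(count for _, _, count in ZONES)

density = {}
for zone, area_pct, count in ZONES:
    flow_pct = 100.0 * count / total_flows
    # Assumed definition: share of debris flows divided by share of area.
    density[zone] = flow_pct / area_pct
    print(f"{zone}: {flow_pct:.3f}% of debris flows, density {density[zone]:.3f}")

# Share of all debris flows captured by the two most susceptible zones.
top_two_share = 100.0 * sum(count for _, _, count in ZONES[-2:]) / total_flows
print(f"relatively high + high: {top_two_share:.2f}% of debris flows")
```

A density above 1 means a zone holds more debris flows than its area alone would suggest. The values reproduce the table to within rounding, and the combined share of the two most susceptible zones matches the 78.57% quoted in the abstract.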
Table 5. Evaluation performance statistics of the RF model and RSIV-RF model
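
| Data set | Accuracy (RF) | Accuracy (RSIV-RF) | Mean squared error (RF) | Mean squared error (RSIV-RF) | Kappa coefficient (RF) | Kappa coefficient (RSIV-RF) |
|---|---|---|---|---|---|---|
| Training set | 0.87 | 0.89 | 0.12 | 0.10 | 0.62 | 0.70 |
| Test set | 0.76 | 0.83 | 0.23 | 0.16 | 0.55 | 0.63 |
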
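
For readers who want to reproduce the three metrics in Table 5 on their own predictions, a minimal sketch follows. The abstract does not say how each metric was computed, so several choices here are assumptions: the function name `evaluate`, the 0.5 decision threshold, the use of predicted probabilities for the mean squared error, and the toy labels.

```python
def evaluate(y_true, y_prob, threshold=0.5):
    """Return accuracy, mean squared error, and Cohen's kappa for binary labels.

    y_true -- list of 0/1 labels (1 = debris flow)
    y_prob -- list of predicted probabilities of class 1
    """
    n = len(y_true)
    if n == 0 or n != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")

    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Assumption: squared error is taken against the predicted probability.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / n

    # Chance agreement from the marginal frequency of each class.
    true_pos_rate = sum(y_true) / n
    pred_pos_rate = sum(y_pred) / n
    expected = (true_pos_rate * pred_pos_rate
                + (1 - true_pos_rate) * (1 - pred_pos_rate))
    if expected == 1:
        raise ValueError("kappa is undefined when only one class is present")
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, mse, kappa


# Toy example: four debris-flow cells and four non-debris-flow cells.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.6, 0.3]
accuracy, mse, kappa = evaluate(labels, probabilities)
print(f"accuracy={accuracy:.3f}, mse={mse:.3f}, kappa={kappa:.3f}")
```

Kappa corrects accuracy for the agreement expected by chance, which is why it sits well below the accuracy figures in the table even for the better model.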