
Outlier detection method for geotechnical engineering based on MetaOD model selection

Zou Tongtong, Liu Xiaoyi, Liu Jinquan, Yuan Hailiang, Lu Yubin, Zhang Wanhu

Citation: Zou Tongtong, Liu Xiaoyi, Liu Jinquan, Yuan Hailiang, Lu Yubin, Zhang Wanhu. Outlier detection method for geotechnical engineering based on MetaOD model selection[J]. Bulletin of Geological Science and Technology, 2022, 41(2): 239-245. doi: 10.19509/j.cnki.dzkq.2022.0041


doi: 10.19509/j.cnki.dzkq.2022.0041
Funding:

National Natural Science Foundation of China, Young Scientists Fund 51809253

Natural Science Foundation of Fujian Province 2019J01142

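Details
    About the author:

    Zou Tongtong (born 1996), female, is pursuing a master's degree in applied statistics and works mainly on outlier detection in geotechnical engineering data. E-mail: 17320026010@163.com

    Corresponding author:

    Liu Jinquan (born 1989), male, associate researcher, works mainly on teaching and research in geotechnical engineering. E-mail: jinquanliu99@163.com

  • CLC number: X830.3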


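  • Abstract: Field and laboratory test data on geotechnical parameters form the basis of construction, design, and evaluation. Outliers in these data often mislead the choice of construction and design parameters, so outlier detection is a basic but extremely important step in ensuring that a project is safe and reliable. Traditional outlier detection algorithms have no model selection step, which makes detection blind. To address this, the paper proposes an outlier detection framework that combines the meta-learning outlier detection algorithm (MetaOD) with data mining algorithms. The framework first selects, according to the characteristics of the data, the initial model types and parameters suited to that type of data, and averages the parameters of the selected algorithms of the same type. It then applies the chosen algorithm to detect outliers, which improves detection accuracy. To evaluate the framework, it was tested on the glass dataset from the machine learning repository of the University of California, Irvine (UCI). With this framework the precision of outlier detection reached 96.41%, considerably higher than that of the other detection algorithms compared. Finally, the framework was applied to the uniaxial compressive strength dataset of Macau granite and to the groundwater level monitoring data of the Junchang Tunnel, where it identified 9 and 10 outliers, respectively.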

     

  • Figure 1.  Flow chart

    Figure 2.  Visualization of the distribution and correlation of each variable

    Figure 3.  Time sequence diagram of current change and cumulative change

    Note: in the figure, 63 and 1 440 denote the monitoring group number and the groundwater level (m), respectively.

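    Table 1.  Confusion matrix of classification results

    | Actual class | Predicted positive | Predicted negative |
    |---|---|---|
    | Positive | TP (true positive) | FN (false negative) |
    | Negative | FP (false positive) | TN (true negative) |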
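The confusion matrix in Table 1 is what the precision values reported on this page are built from. The sketch below assumes the standard definition, precision = TP / (TP + FP), which this page does not spell out. The function names are hypothetical, and the counts TP = 188 and FP = 7 are taken from the ABOD(16) row of Table 3.

```python
def precision(tp: int, fp: int) -> float:
    """Return precision = TP / (TP + FP) as a fraction in [0, 1]."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        # No sample was predicted positive, so the ratio is undefined.
        raise ValueError("precision is undefined when TP + FP == 0")
    return tp / predicted_positive


def precision_percent(tp: int, fp: int, digits: int = 2) -> float:
    """Return precision as a percentage rounded to `digits` decimals."""
    return round(100.0 * precision(tp, fp), digits)


if __name__ == "__main__":
    # Counts from the ABOD(16) row of Table 3.
    print(precision_percent(tp=188, fp=7))  # 96.41
```

Under this definition the result agrees with the 96.41% quoted in the abstract.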

    Table 2.  Model selection results for the glass dataset

    | Rank | Model | Parameter (number of neighbors) |
    |---|---|---|
    | 1 | ABOD | 5 |
    | 2 | ABOD | 15 |
    | 3 | ABOD | 20 |
    | 4 | ABOD | 25 |
    | 5 | IForest | (20, 0.2) |

    Table 3.  Confusion matrix and precision of the glass dataset under different parameters

    | Model | TP | FN | FP | TN | Precision/% | Runtime/s |
    |---|---|---|---|---|---|---|
    | ABOD(5) | 187 | 18 | 7 | 2 | 96.39 | 0.1008 |
    | ABOD(15) | 186 | 19 | 7 | 2 | 96.37 | 0.4359 |
    | ABOD(20) | 187 | 18 | 7 | 2 | 96.39 | 0.7700 |
    | ABOD(25) | 186 | 19 | 6 | 3 | 96.37 | 1.3644 |
    | ABOD(16) | 188 | 17 | 7 | 2 | 96.41 | 0.4849 |
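The abstract says that the parameters of selected algorithms of the same type are averaged. The ABOD(16) row in Table 3 is consistent with averaging the four ABOD neighbor counts from Table 2 (5, 15, 20, 25; mean 16.25) and rounding to an integer. The sketch below illustrates that step only. The function name and the rounding rule are assumptions, not the authors' published procedure, and the rank 5 IForest entry is left out because its parameters are a tuple of a different kind.

```python
from collections import defaultdict
from statistics import mean

# ABOD candidates as listed in Table 2: (model, number of neighbors).
selected = [("ABOD", 5), ("ABOD", 15), ("ABOD", 20), ("ABOD", 25)]


def average_parameters(candidates):
    """Group candidates by model name and average each group's parameter."""
    grouped = defaultdict(list)
    for model, param in candidates:
        grouped[model].append(param)
    # Rounding to the nearest integer is an assumption of this sketch.
    return {model: round(mean(params)) for model, params in grouped.items()}


if __name__ == "__main__":
    print(average_parameters(selected))  # {'ABOD': 16}
```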

    Table 4.  Comparison of detection results between the proposed model and conventional algorithms

    | Model | TP | FN | FP | TN | Precision/% | Runtime/s |
    |---|---|---|---|---|---|---|
    | ABOD(16) | 188 | 17 | 7 | 2 | 96.41 | 0.4849 |
    | COF | 185 | 20 | 7 | 2 | 90.24 | 0.1958 |
    | HBOS | 184 | 21 | 8 | 1 | 89.76 | 1.3035 |
    | OCSVM | 184 | 21 | 8 | 1 | 89.76 | 0.0148 |
    | LODA | 184 | 21 | 8 | 1 | 89.76 | 0.0274 |
    | CBLOF | 185 | 20 | 7 | 2 | 90.24 | 1.0967 |
    | COPOD | 184 | 21 | 8 | 1 | 89.76 | 0.0369 |
    | MCD | 183 | 22 | 9 | 0 | 89.27 | 0.0605 |
    | PCA | 184 | 21 | 8 | 1 | 89.76 | 0.0126 |
    | IForest | 185 | 20 | 7 | 2 | 90.24 | 0.1713 |

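    Table 5.  Correlation among variables of the Macau Grade Ⅲ granite dataset

    |  | UCS | Is(50) | RL | vp | ηe | Gs |
    |---|---|---|---|---|---|---|
    | UCS | 1 | 0.75 | 0.66 | 0.77 | -0.56 | 0.48 |
    | Is(50) |  | 1 | 0.65 | 0.46 | -0.50 | 0.41 |
    | RL |  |  | 1 | 0.50 | -0.68 | 0.57 |
    | vp |  |  |  | 1 | -0.44 | 0.37 |
    | ηe |  |  |  |  | 1 | 0.91 |
    | Gs |  |  |  |  |  | 1 |

    Note: UCS, uniaxial compressive strength; Is(50), point load index; RL, Schmidt hammer rebound value; vp, P-wave velocity; ηe, effective porosity; Gs, specific gravity.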

    Table 6.  Model selection results for the Macau Grade Ⅲ granite dataset

    | Rank | Model | Parameters |
    |---|---|---|
    | 1 | HBOS | (5, 0.1) |
    | 2 | IForest | (20, 0.5) |
    | 3 | IForest | (75, 0.3) |
    | 4 | IForest | (200, 0.2) |
    | 5 | IForest | (20, 0.2) |

    Table 7.  Model selection results for the groundwater level monitoring data of the Junchang Tunnel

    | Rank | Model | Parameters |
    |---|---|---|
    | 1 | LODA | (25, 200) |
    | 2 | LODA | (15, 20) |
    | 3 | IForest | (50, 0.2) |
    | 4 | LODA | (25, 50) |
    | 5 | LODA | (30, 20) |
Publication history
  • Received:  2021-05-27
