AI-driven audiological imaging analysis: technological innovations and application prospects

李孛;蒋忻洋

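1: Department of Otolaryngology-Head and Neck Surgery, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine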

2: Microsoft Research Asia

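Abstract
Since it was first proposed in the 1950s, artificial intelligence (AI) has evolved through paradigms such as symbolism and connectionism, moving gradually from theoretical exploration to practical application. In the early 21st century, deep learning techniques exemplified by the convolutional neural network (CNN), enabled by large-scale data and breakthroughs in computing power, achieved revolutionary progress in fields such as image recognition and speech processing. After 2017, with the introduction of the Transformer architecture, AI entered a new stage driven by large models: models built on the self-attention mechanism not only fundamentally changed the paradigm of natural language processing but also achieved cross-task generalization through the pretrain-then-fine-tune approach.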
