| Graduate Student: | 張先昀 Hsien-Yun Chang |
|---|---|
| Thesis Title: | Newton — 基於深度學習之高速智慧標記資料雲端平台 (Newton: High-speed Web-based Intelligent Labeling Platform Based On Deep Learning) |
| Advisor: | 戴文凱 Wen-Kai Tai |
| Committee Members: | 鮑興國 Hsing-Kuo Pao, 章耀勳 Yao-Xun Chang |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
| Year of Publication: | 2023 |
| Academic Year of Graduation: | 111 |
| Language: | Chinese |
| Pages: | 42 |
| Chinese Keywords: | deep learning, fast labeling platform, fast labeling tool, object detection labeling, image recognition labeling, image matting labeling, image segmentation labeling |
| English Keywords: | fast annotation platform, web-based annotation tool, intelligent labeling platform, intelligent annotation platform |
Labeled data is a crucial foundation of the deep learning workflow: whatever the field of study, a large amount of training data helps improve model performance. However, producing accurate training data takes substantial manpower and time. Although many public datasets are available in academia, advancing a specific research area inevitably requires collecting and labeling the particular types of data needed before training. Labeling data for object detection or image segmentation has traditionally been time-consuming: whether annotating with polygon outlines or painting with a brush, images with many objects, or objects with fine details, take considerably longer to process.
In this thesis, we propose a high-speed intelligent labeling platform based on deep learning that mainly assists labeling work in the field of image recognition. With the help of a general-purpose Object Detection Model and an Image Matting Model, it accelerates the labeling workflow and reduces labor costs. The data are first pre-labeled by the Object Detection Model to locate the objects of interest in an image; each object region is then passed to the Image Matting Model to separate foreground from background and extract the foreground edges. This yields not only a precise bounding box for object detection training but also an alpha map for image matting training, which can further be converted into semantic segmentation, instance segmentation, and panoptic segmentation labels for image segmentation training.
During labeling, users can not only re-predict the foreground and background of an object in real time but also use interactive brushes to indicate the foreground and background positions to the model, improving data accuracy. Beyond accelerating the labeling workflow, the platform supports multiple accounts with permission control. To keep the workflow as simple and fast as possible for labelers and reviewers, a labeler who logs in can, besides labeling, immediately see the status of previously labeled data and check whether any items were returned for relabeling; a reviewer who logs in goes directly to the review function to verify data; and an administrator can monitor the labeling progress and statistics of all data on the platform in real time.
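The interactive brush guidance described above can be sketched as user strokes that override an automatically generated trimap before the matting model re-predicts. The function name, stroke format, and the 0/128/255 trimap convention below are illustrative assumptions, not the platform's actual interface:

```python
import numpy as np

# Trimap codes commonly used in matting: 0 = background,
# 128 = unknown, 255 = foreground. (Illustrative convention.)
BG, UNKNOWN, FG = 0, 128, 255

def apply_brush_strokes(trimap, strokes):
    """Override an auto-generated trimap with user brush strokes.
    Each stroke is (row_slice, col_slice, label); painted regions act
    as hard constraints when the matting model re-predicts."""
    guided = trimap.copy()
    for rows, cols, label in strokes:
        guided[rows, cols] = label
    return guided

# Initial trimap: everything unknown
trimap = np.full((4, 4), UNKNOWN, dtype=np.uint8)
# User paints the top row as background and the centre as foreground
strokes = [(slice(0, 1), slice(0, 4), BG),
           (slice(1, 3), slice(1, 3), FG)]
guided = apply_brush_strokes(trimap, strokes)
```

Treating strokes as hard constraints lets the model refine only the remaining unknown pixels on each re-prediction.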
Annotated data is an essential foundation of the deep learning process. Whatever the field of study, a large amount of training data helps improve a model's performance. However, producing accurate training data takes a lot of workforce and time. Although many public datasets are available in academia, to advance in a specific research field you must collect and label the types of data you need. Marking data for object detection or image segmentation is time-consuming, whether annotated with a polygonal marquee or painted with a brush, especially on pictures with many objects or data with fine details.
In this thesis, we propose a high-speed intelligent labeling platform based on deep learning for assisting labeling in the field of image recognition, reducing labor costs and accelerating the labeling process through an Object Detection Model and an Image Matting Model. We first pre-label the data through the Object Detection Model to obtain the objects of interest in the image, then use the Image Matting Model to segment the foreground and background of each object area and extract the edge of the foreground. Not only can we obtain an accurate bounding box for object detection training, but an alpha map can also be obtained for image matting training and converted to semantic segmentation, instance segmentation, and panoptic segmentation for image segmentation training.
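As a concrete illustration of deriving labels from the matting output, the following sketch computes a binary segmentation mask and a tight bounding box from an alpha map with NumPy. The [0, 1] normalization and the 0.5 threshold are hypothetical choices, not the platform's documented parameters:

```python
import numpy as np

def alpha_to_labels(alpha, fg_threshold=0.5):
    """Derive a binary segmentation mask and a tight bounding box
    from a matting alpha map (values assumed in [0, 1])."""
    mask = (alpha >= fg_threshold).astype(np.uint8)  # binary foreground mask
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return mask, None  # no foreground found
    # Tight bounding box as (x_min, y_min, x_max, y_max)
    bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return mask, bbox

# Example: a 6x6 alpha map with a 2x2 solid foreground patch
alpha = np.zeros((6, 6))
alpha[2:4, 2:4] = 1.0
alpha[1, 2] = 0.3  # soft edge pixel, below the threshold
mask, bbox = alpha_to_labels(alpha)
print(bbox)  # (2, 2, 3, 3)
```

Converting the same mask to COCO- or VOC-style segmentation annotations is then a matter of serialization rather than additional labeling effort.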
In the process of labeling data, in addition to re-predicting the foreground and background of objects in real time, interactive brushes can be used to guide the model on the positions of foreground and background, thereby improving data accuracy. In addition, the platform supports multiple accounts to manage the status of labeled data, confirm whether an item has been returned and needs relabeling, and view statistics on all data on the platform in real time.
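The multi-account workflow above amounts to role-based access control. A minimal sketch follows; the role names and action strings are hypothetical, not taken from the platform's implementation:

```python
# Hypothetical role-to-action mapping for the platform's three user roles:
# labelers annotate and track their own submissions, reviewers accept or
# return data, and administrators monitor overall progress.
ROLE_ACTIONS = {
    "labeler":  {"label", "view_own_status"},
    "reviewer": {"review", "return_for_relabel"},
    "admin":    {"view_progress", "view_statistics"},
}

def can(role, action):
    """Check whether a role is permitted to perform an action."""
    return action in ROLE_ACTIONS.get(role, set())
```

Routing each user to their role's landing page after login (labeling, review, or dashboard) follows directly from such a mapping.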