
Graduate Student: 陳小星 (Julianne Agatha Tan)
Thesis Title: Contextual Fashion Compatibility
Advisor: 花凱龍 (Kai-Lung Hua)
Committee Members: 鄭文皇 (Wen-Huang Cheng), 陳駿丞 (Jun-Cheng Chen), 余能豪 (Neng-Hao Yu), 郭彥甫 (Yan-Fu Kuo)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2021
Academic Year of Graduation: 109
Language: English
Number of Pages: 34
Keywords: Fashion Compatibility
Fashion compatibility aims to learn how different items complement each other. One of the difficulties in this task is the complex notion of compatibility. Prior works reduce compatibility to visual cues such as color, pattern, and silhouette, but fail to account for contextual compatibility such as occasion, leading to inappropriate recommendations. For instance, black running shorts are visually compatible with a black blazer but are not appropriate for office wear. In this paper, we propose a fashion compatibility framework that accepts an arbitrary number of clothing items and respects an outfit's context, such as occasion, when predicting compatibility. We designed a two-stage model that first learns visual compatibility followed by contextual compatibility, in order to capture the hierarchical notion of compatibility wherein contextual compatibility is irrelevant if visual compatibility is not satisfied. The first stage learns visual compatibility through metric learning, wherein the model learns pairwise clothing-category subspaces that are representative of clothing similarity. The second stage learns a permutation-invariant representation of the clothing similarity scores and predicts a compatibility score for every context considered.


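The two-stage pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch, not the thesis's actual implementation: all names (`project`, `visual_scores`, `contextual_scores`, `context_weights`) are assumptions, the learned subspace projections are stood in for by fixed masks, cosine similarity stands in for the learned metric, and mean pooling is used as one simple permutation-invariant aggregation.

```python
import itertools
import math


def project(embedding, mask):
    """Project an item embedding into a category-pair subspace.

    In the thesis this projection is learned per category pair; here a
    fixed element-wise mask stands in for it.
    """
    return [e * m for e, m in zip(embedding, mask)]


def cosine(u, v):
    """Cosine similarity, standing in for the learned similarity metric."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(a * a for a in v)) or 1.0
    return dot / (nu * nv)


def visual_scores(items, masks):
    """Stage 1: pairwise similarity in category-pair subspaces.

    items: {category: embedding vector}, any number of clothing items.
    masks: {(cat_a, cat_b): subspace mask}, keys sorted alphabetically.
    Returns one similarity score per unordered item pair.
    """
    scores = []
    for (ca, ea), (cb, eb) in itertools.combinations(sorted(items.items()), 2):
        mask = masks[tuple(sorted((ca, cb)))]
        scores.append(cosine(project(ea, mask), project(eb, mask)))
    return scores


def contextual_scores(pair_scores, context_weights):
    """Stage 2: permutation-invariant pooling, then per-context scores.

    Mean pooling makes the result independent of the order of the pairwise
    scores, so the model handles an arbitrary number of items.
    """
    pooled = sum(pair_scores) / len(pair_scores)
    return {ctx: pooled * w for ctx, w in context_weights.items()}


# Example: an outfit that is visually close but scores differently per context.
items = {"blazer": [1.0, 0.0, 0.0], "shorts": [1.0, 0.2, 0.0]}
masks = {("blazer", "shorts"): [1.0, 1.0, 0.0]}
scores = visual_scores(items, masks)
per_context = contextual_scores(scores, {"office": 0.8, "sports": 0.3})
```

The hierarchy in the abstract shows up in the data flow: stage 2 only ever sees the pairwise visual-similarity scores, so an outfit that fails visually cannot score well in any context.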

    Recommendation Letter .................................... i
    Approval Letter .......................................... ii
    Abstract in English ...................................... iii
    Acknowledgements ......................................... iv
    Contents ................................................. v
    List of Figures .......................................... vii
    List of Tables ........................................... ix
    1 Introduction ........................................... 1
    2 Review of Related Literature ........................... 5
    3 Method ................................................. 7
        3.0.1 Visual Compatibility Network (VCN) ............. 9
        3.0.2 Contextual Compatibility Network (CCN) ......... 12
    4 Results and Discussions ................................ 15
      4.1 Experiment ......................................... 15
        4.1.1 Implementation Details ......................... 15
        4.1.2 Baselines ...................................... 16
        4.1.3 Compatibility and FITB Experiments ............. 18
        4.1.4 Qualitative Experiments ........................ 20
        4.1.5 Ablation ....................................... 23
        4.1.6 On dataset requirements ........................ 27
    5 Conclusions ............................................ 28
    References ............................................... 31
    Letter of Authority ...................................... 34


    Full Text Release Date: 2026/02/01 (campus network)
    Full Text Release Date: 2026/02/01 (off-campus network)
    Full Text Release Date: 2026/02/01 (National Central Library: Taiwan NDLTD system)