
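Graduate student: 何學昕 (Xue-Xin He)
Thesis title: Deadline-aware Memory Scheduler and Governor for Heterogeneous Processors (異質處理器之截止期限感知的記憶體管理)
Advisor: 陳雅淑 (Ya-Shu Chen)
Committee members: 陳雅淑 (Ya-Shu Chen), 陳筱青 (Hsiao-Chin Chen), 謝仁偉 (Jen-Wei Hsieh), 吳晉賢 (Chin-Hsien Wu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar; 2017-2018)
Language: English
Pages: 36
Keywords (translated from Chinese): power management, dynamic voltage and frequency scaling, memory-aware
Keywords (English): energy management, DVFS, memory-aware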

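To pursue the best balance between performance and power consumption, heterogeneous computing systems combined with memory subsystems have become standard hardware in mobile devices. However, applications running on such systems execute on units with different power consumption and must still satisfy user quality requirements, which makes energy-aware scheduling on these systems quite challenging. We propose a deadline-aware scheduler to handle the performance issues caused by memory bandwidth contention, together with a run-time memory overclocking feedback (reclaiming) mechanism and a dynamic voltage and frequency scaling governor that accounts for the power of both the heterogeneous processors and the memory, in order to reduce energy consumption.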


Mobile systems are usually equipped with heterogeneous processors and memories to trade off power and performance. Energy-efficient scheduling on such systems is difficult for two reasons: tasks must meet end-to-end deadlines, and the heterogeneous components consume different amounts of power. A deadline-aware scheduler is proposed to deal with the performance degradation caused by memory bandwidth contention. A DVFS governor that manages both processors and memory, together with run-time memory overclocking reclaiming, is presented to minimize energy consumption while maintaining performance. Evaluation results show that the framework achieves considerable schedulability and energy conservation.
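The idea of a governor that trades processor and memory frequency against a deadline can be illustrated with a small sketch. This is not the thesis's algorithm: the function name `pick_frequencies`, the non-overlapping timing model, the per-phase energy model, and all frequency and power values below are assumptions chosen only for illustration.

```python
from itertools import product


def pick_frequencies(cpu_cycles, mem_cycles, deadline_s, cpu_levels, mem_levels):
    """Return (energy_j, cpu_hz, mem_hz) with the least energy that meets the deadline.

    cpu_levels and mem_levels are lists of (frequency_hz, active_power_w) pairs.
    Returns None when no frequency pair can finish before the deadline.
    """
    best = None
    for (f_cpu, p_cpu), (f_mem, p_mem) in product(cpu_levels, mem_levels):
        # Assumed timing model: compute and memory phases do not overlap.
        t_cpu = cpu_cycles / f_cpu
        t_mem = mem_cycles / f_mem
        if t_cpu + t_mem > deadline_s:
            continue  # this pair would miss the deadline
        # Assumed energy model: each component draws power only in its own phase.
        energy = p_cpu * t_cpu + p_mem * t_mem
        if best is None or energy < best[0]:
            best = (energy, f_cpu, f_mem)
    return best


# Hypothetical operating points: (frequency in Hz, active power in W).
CPU_LEVELS = [(0.5e9, 0.4), (1.0e9, 1.0), (1.5e9, 2.1)]
MEM_LEVELS = [(400e6, 0.3), (800e6, 0.7)]

choice = pick_frequencies(6e8, 2e8, 1.0, CPU_LEVELS, MEM_LEVELS)
print(choice)
```

With these made-up numbers, the slowest processor level cannot meet the one-second deadline at all, and the middle processor level is feasible only with the faster memory. The sketch therefore raises the memory frequency so that the processor can stay at a lower level, which is the kind of joint processor and memory decision the abstract describes.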

1. Introduction
2. System Model
3. Approaches
  3.1 Scheduling Flow
  3.2 Motivated Example
  3.3 Deadline-aware Scheduler
  3.4 DVFS Governor
4. Performance Evaluation
  4.1 Experimental Setup
  4.2 Experimental Results
    4.2.1 Schedulability
    4.2.2 Energy Conservation
    4.2.3 Memory Overclocking Reclaiming
5. Conclusion
References


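Full-text release date: 2023/02/08 (campus network)
Full-text release date: not authorized for public release (off-campus network)
Full-text release date: not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)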