Publications

(2024). Power Profile Monitoring and Tracking Evolution of System-Wide HPC Workloads. ICDCS'24.
(2024). Understanding GPU Memory Corruption at Extreme Scale: The Summit Case Study. ACM ICS'24.
(2021). Revealing power, energy and thermal dynamics of a 200PF pre-exascale supercomputer. SC'21.
(2021). The Challenge of Disproportionate Importance of Temporal Features in Predicting HPC Power Consumption. CLUSTER'21 HPCMASPA Workshop.
(2021). A Conceptual Framework for HPC Operational Data Analytics. CLUSTER'21 EEHPCWG State of Practice Workshop.
(2020). Hybrid Approach to HPC Cluster Telemetry and Hardware Log Analytics. IEEE HPEC'20.
(2020). Global Experiences with HPC Operational Data Measurement, Collection and Analysis. CLUSTER'20 EEHPCWG State of Practice Workshop.
(2019). Data Jockey: Automatic Data Management for HPC Multi-tiered Storage Systems. IEEE IPDPS'19.
(2016). Unblinding the OS to Optimize User-Perceived Flash SSD Latency. USENIX HotStorage'16.
(2015). Providing QoS through host controlled flash SSD garbage collection and multiple SSDs. IEEE BigComp'15.