Full Publication List | Tony@Rochester

Journal Publications

[Ph.D. Dissertation] T.Geng: "FPGA-based high-performance neural network acceleration", Boston University.
[TPDS 2021] T.Geng, T.Wang, C.Wu, Y.Li, ..., A.Li, M.Herbordt: "O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference", IEEE Transactions on Parallel and Distributed Systems.
[TPDS 2021] C.Tan, C.Xie, T.Geng, ..., K.Barker, A.Li: "ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing", IEEE Transactions on Parallel and Distributed Systems.
[MICPRO 2021] Y. Li, T.Geng, A. Li, H. Yu: BCNN: Binary Complex Neural Network, Microprocessors and Microsystems.
[BDMA 2021] Y. Li, T.Geng, A. Li, H. Yu: GAAF: Searching Activation Functions for Binary Neural Networks through Genetic Algorithm, Journal of Big Data Mining and Analytics.
[CPE 2021] P. Haghi, ..., T.Geng, ..., A. Skjellum, M. C. Herbordt: Reconfigurable Switches for High Performance and Flexible MPI Collectives, Concurrency and Computation: Practice and Experience.
[TC 2020] T.Geng*, T.Wang*, A.Li, X.Jin, M.Herbordt: "FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters", IEEE Transactions on Computers.

Conference Publications

[HPCA 2022] H.You*, T.Geng*, Y.Zhang, A.Li, Y.Lin: GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design, The 28th IEEE International Symposium on High-Performance Computer Architecture.
[HPCA 2022] C.Tan, N.B.Agostini, T.Geng, C.Xie, J.Li, A.Li, K.Barker, A.Tumeo: DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs, The 28th IEEE International Symposium on High-Performance Computer Architecture.
[DAC 2022] H. Peng, S. Huang, ..., T.Geng, ..., C.Ding: A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining, The 58th Design Automation Conference.
[ICS 2022] C.Zhang, S.Jin, T.Geng, J.Tian, A.Li, D.Tao: "Accelerating Parallel I/O Via Hardware-Algorithm Co-Designed Adaptive Lossy Compression", the 36th ACM International Conference on Supercomputing.
[ICS 2022] C.Tan, T.Tembe, J.Zhang, B.Fang, T.Geng, G.Wei, D.Brooks, A.Tumeo, G.Gopalakrishnan, A.Li: "ASAP - Automatic Synthesis of Area-Efficient and Precision-Aware CGRA", the 36th ACM International Conference on Supercomputing.
[FPL 2022] C.Zhang, T.Geng, A.Guo, J.Tian, M.Herbordt, A.Li, D.Tao: "H-GCN: A Graph Convolutional Network Accelerator on Xilinx Versal AI Engines", the 32nd International Conference on Field-Programmable Logic and Applications.
[FPL 2022] A.Guo, T.Geng, Y.Zhang, P.Haghi, C.Wu, T.Cheng, Y.Lin, A.Li, M.Herbordt: "A Framework for Neural Network Inference on FPGA-Centric SmartNICs", the 32nd International Conference on Field-Programmable Logic and Applications.
[FPL 2022] C.Wu, S.Bandara, T.Geng, A.Guo, P.Haghi, W.Sherman, V.Sachdeva, M.Herbordt: "Optimized Mappings for Symmetric Range-Limited Molecular Force Calculations on FPGAs", the 32nd International Conference on Field-Programmable Logic and Applications.
[MICRO 2021] T.Geng, C.Wu, ..., M.Herbordt, Y.Lin, A.Li: I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization, the 54th IEEE/ACM International Symposium on Microarchitecture.
[HPEC 2021] T.Geng, C.Wu, C.Tan, ..., M.Herbordt, A.Li: A Survey: Handling Irregularities in Neural Network Acceleration with FPGAs, IEEE High Performance Extreme Computing Conference.
[SC 2021] B.Feng, Y.Wang, T.Geng, A.Li, Y.Ding: "APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores", Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis.
[ICCAD 2021] Y.Zhang, H.You, Y.Fu, T.Geng, A.Li, Y.Lin: G-CoS: GNN-Accelerator Co-Search Towards Both Better Accuracy and Efficiency, 2021 International Conference On Computer Aided Design.
[ICCAD 2021] D.Manu, ..., T.Geng, A.Li, C.Ding, W.Jiang, L.Yang: BFL-DISCO: Federated Generative Adversarial Network for Graph-based Molecule Drug Discovery, 2021 International Conference On Computer Aided Design.
[ICCAD 2021] H.Peng, ..., T.Geng, A.Li, J.Bi, M.Song, W.Jiang, H.Liu, C.Ding: Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search, 2021 International Conference On Computer Aided Design.
[ICCD 2021] C.Tan, T.Geng, C.Xie, N.Agostini, J.Li, A.Li, K.Barker, A.Tumeo: DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications, the 39th IEEE International Conference on Computer Design. (Best Paper Award)
[FCCM 2021] C.Wu, T.Geng, S.Bandara, C.Yang, V.Sachdeva, W.Sherman, M.Herbordt: "Upgrade of FPGA Range-Limited Molecular Dynamics to Handle Hundreds of Processors", the 30th IEEE International Symposium On Field-Programmable Custom Computing Machines.
[ISQED 2021] H.Peng, S.Huang, T.Geng, A.Li, W.Jiang, H.Liu, S.Wang, C.Ding: "Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning", the 22nd International Symposium on Quality Electronic Design.
[ASAP 2021] H.Peng, ..., T.Geng, ..., C.Ding: "Binary Complex Neural Network Acceleration on FPGA", the 32nd IEEE International Conference on Application-specific Systems, Architectures and Processors.
[ASAP 2021] C.Tan, T.Geng, ..., A.Tumeo: "OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays", the 32nd IEEE International Conference on Application-specific Systems, Architectures and Processors.
[HPEC 2021] P.Haghi, A.Guo, T.Geng, ..., M.Herbordt: Workload Imbalance in HPC Applications: Effect on Performance of In-Network Processing, IEEE High Performance Extreme Computing Conference. (Best Student Paper Award)
[HPEC 2021] C.Wu, S.Bandara, T.Geng, ..., M.Herbordt: System-Level Modeling of GPU/FPGA Clusters for Molecular Dynamics Simulations, IEEE High Performance Extreme Computing Conference.
[MICRO 2020] T.Geng, A.Li, T.Wang, C.Wu, Y.Li, ..., M.Herbordt: "AWB-GCN: A Hardware Accelerator of Graph-Convolution-Network through Runtime Workload Rebalancing", the 53rd IEEE/ACM International Symposium on Microarchitecture.
[ICS 2020] T.Geng*, R.Shi*, P.Dong*, ..., M.Herbordt, A.Li, Y.Wang: "CSB-RNN: A Faster-than-Realtime RNN Acceleration Framework with Compressed Structured Blocks", the 34th ACM International Conference on Supercomputing.
[HPEC 2020] T.Geng, C.Wu, C.Tan, B.Fang, A.Li, M.Herbordt: "CQNN: a CGRA-based QNN Framework", IEEE High Performance Extreme Computing Conference.
[FCCM 2020] P.Haghi, T.Geng, T.Wang, A. Guo, M.Herbordt: "FP-AMG: FPGA-Based Acceleration Framework for Algebraic Multigrid Solvers", the 29th IEEE International Symposium On Field-Programmable Custom Computing Machines.
[FPT 2020] P.Haghi, A. Guo, T.Geng, ..., M.Herbordt: "A Reconfigurable Compute-in-the-Network FPGA Assistant for High-Level Collective Support with Distributed Matrix Multiply Case Study", 2020 International Conference on Field-Programmable Technology.
[HPEC 2020] C.Wu, T.Geng, V.Sachdeva, W.Sherman, M.Herbordt: "A Communication-Efficient Multi-Chip Design for Range-Limited Molecular Dynamics", IEEE High Performance Extreme Computing Conference.
[HPEC 2020] P.Haghi, A.Guo, ..., T.Geng, J.Broaddus, R.Marshall, A.Skjellum, M.Herbordt: "FPGAs in the Network and Novel Communicator Support Accelerate MPI Collectives", IEEE High Performance Extreme Computing Conference.
[SC 2019] A.Li, T.Geng, T.Wang, M.Herbordt, S.Song, K.Barker: "BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets", Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis.
[SC 2019] C.Yang, T.Geng, T.Wang, ..., M.Herbordt: "Fully integrated FPGA molecular dynamics simulations", Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis.
[ICS 2019] T.Geng, T.Wang, C.Wu, C.Yang, W.Wu, A.Li, M.Herbordt: "O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning", the 33rd ACM International Conference on Supercomputing.
[ASAP 2019] T.Geng, T.Wang, ..., M.Herbordt: "LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism", the 30th IEEE International Conference on Application-specific Systems, Architectures and Processors.
[FCCM 2019] T.Wang, T.Geng, X.Jin, M.Herbordt: "FP-AMR: A Reconfigurable Fabric Framework for Block-Structured Adaptive Mesh Refinement Applications", the 28th IEEE International Symposium On Field-Programmable Custom Computing Machines.
[FCCM 2019] Q.Xiong, C.Yang, R.Xu, R.Patel, T.Geng, A.Skjellum, M.Herbordt: "GhostSZ: A Transparent SZ Lossy Compression Framework with FPGAs", the 28th IEEE International Symposium On Field-Programmable Custom Computing Machines.
[ASAP 2019] C.Yang, T.Geng, T.Wang, J.Sheng, ..., M.Herbordt: "Molecular Dynamics Range-Limited Force Evaluation Optimized for FPGAs", the 30th IEEE International Conference on Application-specific Systems, Architectures and Processors.
[ASAP 2019] T.Wang, T.Geng, X.Jin, M.Herbordt: "Accelerating AP3M-Based Computational Astrophysics Simulations with Reconfigurable Clusters", the 30th IEEE International Conference on Application-specific Systems, Architectures and Processors.
[FPL 2018] T.Geng, T.Wang, A.Sanaullah, C.Yang, R.Patel, M.Herbordt: "A Framework for Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters with Work and Weight Load Balancing", the 28th International Conference on Field-Programmable Logic and Applications.
[FCCM 2018] T.Geng, T.Wang, A.Sanaullah, C.Yang, R.Xu, R.Patel, M.Herbordt: "FPDeep: Acceleration and Load Balancing of CNN Training on FPGA Clusters", the 27th IEEE International Symposium On Field-Programmable Custom Computing Machines.
[HPEC 2018] T.Geng, E.Diken, T.Wang, L.Jozwiak, M.Herbordt: "An Access-Pattern-Aware On-Chip Vector Memory System with Automatic Loading for SIMD Architecture", IEEE High Performance Extreme Computing Conference.
[HPEC 2018] Z.Xiang, T.Wang, T.Geng, ..., M.Herbordt: "Soft-Core, Multiple-Lane, FPGA-based ADCs for a Liquid Helium Environment", IEEE High Performance Extreme Computing Conference.
[DSD 2016] T.Geng, L.Waeijen, M.Peemen, H.Corporaal, Y.He: "MacSim: A MAC-Enabled High-Performance SIMD Architecture for Deep Learning", the 19th Euromicro Conference on Digital System Design.
[SAMOS 2016] Y.He, M.Peemen, L.Waeijen, ..., H.Corporaal, T.Geng: "A Configurable SIMD Architecture with Explicit Datapath for CNN", International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation.