PUBLICATIONS

Selected Journal Publications
  1. [Ph.D. Dissertation] T.Geng: "FPGA-based high-performance neural network acceleration", Boston University.

  2. [TPDS 2020] T.Geng, T.Wang, C.Wu, Y.Li, ..., A.Li, M.Herbordt: "O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference", IEEE Transactions on Parallel and Distributed Systems.

  3. [TC 2020] T.Geng*, T.Wang*, A.Li, X.Jin, M.Herbordt: "FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters", IEEE Transactions on Computers.

Selected Conference Publications
  1. [FCCM 2021] C.Wu, T.Geng, S.Bandara, C.Yang, V.Sachdeva, W.Sherman, M.Herbordt: "Upgrade of FPGA Range-Limited Molecular Dynamics to Handle Hundreds of Processors", the 30th IEEE International Symposium On Field-Programmable Custom Computing Machines.

  2. [FPT 2021] P.Haghi, A.Guo, T.Geng, ..., M.Herbordt: "A Reconfigurable Compute-in-the-Network FPGA Assistant for High-Level Collective Support with Distributed Matrix Multiply Case Study", the International Conference on Field-Programmable Technology.

  3. [ISQED 2021] H.Peng, S.Huang, T.Geng, A.Li, W.Jiang, H.Liu, S.Wang, C.Ding: "Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning", the 22nd International Symposium on Quality Electronic Design.

  4. [MICRO 2020] T.Geng, A.Li, T.Wang, C.Wu, Y.Li, ..., M.Herbordt: "AWB-GCN: A Hardware Accelerator of Graph-Convolution-Network through Runtime Workload Rebalancing", the 53rd IEEE/ACM International Symposium on Microarchitecture.

  5. [ICS 2020] T.Geng*, R.Shi*, P.Dong*, ..., M.Herbordt, A.Li, Y.Wang: "CSB-RNN: A Faster-than-Realtime RNN Acceleration Framework with Compressed Structured Blocks", the 34th ACM International Conference on Supercomputing.

  6. [HPEC 2020] T.Geng, C.Wu, C.Tan, B.Fang, A.Li, M.Herbordt: "CQNN: a CGRA-based QNN Framework", IEEE High Performance Extreme Computing Conference.

  7. [FCCM 2020] P.Haghi, T.Geng, T.Wang, A.Guo, M.Herbordt: "FP-AMG: FPGA-Based Acceleration Framework for Algebraic Multigrid Solvers", the 29th IEEE International Symposium On Field-Programmable Custom Computing Machines.

  8. [HPEC 2020] C.Wu, T.Geng, V.Sachdeva, W.Sherman, M.Herbordt: "A Communication-Efficient Multi-Chip Design for Range-Limited Molecular Dynamics", IEEE High Performance Extreme Computing Conference.

  9. [HPEC 2020] P.Haghi, A.Guo, ..., T.Geng, J.Broaddus, R.Marshall, A.Skjellum, M.Herbordt: "FPGAs in the Network and Novel Communicator Support Accelerate MPI Collectives", IEEE High Performance Extreme Computing Conference.

  10. [SC 2019] A.Li, T.Geng, T.Wang, M.Herbordt, S.Song, K.Barker: "BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets", Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis.

  11. [SC 2019] C.Yang, T.Geng, T.Wang, ..., M.Herbordt: "Fully integrated FPGA molecular dynamics simulations", Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis.

  12. [ICS 2019] T.Geng, T.Wang, C.Wu, C.Yang, W.Wu, A.Li, M.Herbordt: "O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning", the 33rd ACM International Conference on Supercomputing.

  13. [ASAP 2019] T.Geng, T.Wang, ..., M.Herbordt: "LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism", the 30th IEEE International Conference on Application-specific Systems, Architectures and Processors.

  14. [FCCM 2019] T.Wang, T.Geng, X.Jin, M.Herbordt: "FP-AMR: A Reconfigurable Fabric Framework for Block-Structured Adaptive Mesh Refinement Applications", the 28th IEEE International Symposium On Field-Programmable Custom Computing Machines.

  15. [FCCM 2019] Q.Xiong, C.Yang, R.Xu, R.Patel, T.Geng, A.Skjellum, M.Herbordt: "GhostSZ: A Transparent SZ Lossy Compression Framework with FPGAs", the 28th IEEE International Symposium On Field-Programmable Custom Computing Machines.

  16. [ASAP 2019] C.Yang, T.Geng, T.Wang, J.Sheng, ..., M.Herbordt: "Molecular Dynamics Range-Limited Force Evaluation Optimized for FPGAs", the 30th IEEE International Conference on Application-specific Systems, Architectures and Processors.

  17. [ASAP 2019] T.Wang, T.Geng, X.Jin, M.Herbordt: "Accelerating AP3M-Based Computational Astrophysics Simulations with Reconfigurable Clusters", the 30th IEEE International Conference on Application-specific Systems, Architectures and Processors.

  18. [FPL 2018] T.Geng, T.Wang, A.Sanaullah, C.Yang, R.Patel, M.Herbordt: "A Framework for Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters with Work and Weight Load Balancing", the 28th International Conference on Field-Programmable Logic and Applications.

  19. [FCCM 2018] T.Geng, T.Wang, A.Sanaullah, C.Yang, R.Xu, R.Patel, M.Herbordt: "FPDeep: Acceleration and Load Balancing of CNN Training on FPGA Clusters", the 27th IEEE International Symposium On Field-Programmable Custom Computing Machines.

  20. [HPEC 2018] T.Geng, E.Diken, T.Wang, L.Jozwiak, M.Herbordt: "An Access-Pattern-Aware On-Chip Vector Memory System with Automatic Loading for SIMD Architecture", IEEE High Performance Extreme Computing Conference.

  21. [HPEC 2018] Z.Xiang, T.Wang, T.Geng, ..., M.Herbordt: "Soft-Core, Multiple-Lane, FPGA-based ADCs for a Liquid Helium Environment", IEEE High Performance Extreme Computing Conference.

  22. [DSD 2016] T.Geng, L.Waeijen, M.Peemen, H.Corporaal, Y.He: "MacSim: A MAC-Enabled High-Performance SIMD Architecture for Deep Learning", the 19th Euromicro Conference on Digital System Design.

  23. [SAMOS 2016] Y.He, M.Peemen, L.Waeijen, ..., H.Corporaal, T.Geng: "A Configurable SIMD Architecture with Explicit Datapath for CNN", International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation.

  24. [arXiv - Published in MICRO53] T.Geng, A.Li, T.Wang, C.Wu, Y.Li, R.Shi, A.Tumeo, S.Che, S.Reinhardt, M.Herbordt: "UWB-GCN: Accelerating Graph Convolutional Networks through Runtime Workload Rebalancing", arXiv:1908.10834v4.