Academic Paper

A Deep Learning Inference Scheme Based on Pipelined Matrix Multiplication Acceleration Design and Non-uniform Quantization

Source: arXiv; Published: 2021-10-10; Authors: Yuyang Zhang, Dik Hin Leung, Min Guo, Yijia Xiao, Haoyue Liu

Abstract

Matrix multiplication is the bedrock of deep learning inference applications. In hardware acceleration on edge computing devices, matrix multiplication often accounts for the great majority of the execution time. To achieve better performance in edge computing, we introduce a low-power Multi-layer Perceptron (MLP) accelerator based on a pipelined matrix multiplication scheme and a non-uniform quantization methodology. The implementation runs on Field-programmable Gate Array (FPGA) devices, and its performance was tested on handwritten digit classification and Q-learning tasks. Results show that our method achieves better performance with lower power consumption.
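
The abstract does not say which non-uniform quantization levels the accelerator uses, so the following is only an illustrative sketch of the general idea, not the authors' scheme. It assumes power-of-two levels (a common hardware-friendly choice, since multiplying by 2^k reduces to a bit shift) and applies them to the weights of a single MLP layer. The function names `quantize_pow2`, `quantize_matrix`, and `dense_layer`, and the exponent range, are hypothetical. The pipelining of the matrix multiplication is a hardware concern and is not modeled here.

```python
import math


def quantize_pow2(w, min_exp=-6, max_exp=0):
    """Snap w to the nearest level (in log scale) of {0, +/-2^min_exp, ..., +/-2^max_exp}.

    Assumption: power-of-two levels stand in for the unspecified
    non-uniform scheme. Levels are dense near zero and sparse near 1.
    """
    if w == 0.0:
        return 0.0
    mag = abs(w)
    # Magnitudes below half of the smallest level are flushed to zero.
    if mag < 2.0 ** (min_exp - 1):
        return 0.0
    exp = round(math.log2(mag))
    exp = max(min_exp, min(max_exp, exp))  # clamp to the representable range
    return math.copysign(2.0 ** exp, w)


def quantize_matrix(weights, min_exp=-6, max_exp=0):
    """Quantize every entry of a row-major weight matrix."""
    return [[quantize_pow2(w, min_exp, max_exp) for w in row] for row in weights]


def dense_layer(x, weights, bias):
    """One MLP layer: y = relu(W x + b), written as explicit multiply-accumulates."""
    out = []
    for row, b in zip(weights, bias):
        acc = b
        for xi, wi in zip(x, row):
            acc += xi * wi  # with power-of-two weights this is a shift-and-add in hardware
        out.append(max(0.0, acc))
    return out


if __name__ == "__main__":
    W = [[0.3, -0.55], [0.9, 0.8]]
    Wq = quantize_matrix(W)
    print(Wq)
    print(dense_layer([1.0, 2.0], Wq, [0.0, 0.0]))
```

Because every quantized weight is zero or a signed power of two, each multiply in the inner loop could be replaced by a shift in a fixed-point datapath, which is one reason such level sets are attractive on FPGAs.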
