
EdgeSP: Scalable Multi-Device Parallel DNN Inference on Heterogeneous Edge Clusters

EasyChair Preprint no. 7483

17 pages · Date: February 19, 2022


Edge computing has emerged as a promising line of research for processing large-scale data and providing low-latency services. Unfortunately, deploying deep neural networks (DNNs) on resource-limited edge devices incurs unacceptable latency, hindering artificial intelligence from empowering edge devices. Prior solutions attempted to address this issue by offloading the workload to the remote cloud. However, the cloud-assisted approach ignores the fact that devices in the edge environment tend to exist as clusters. In this paper, we propose EdgeSP, a scalable multi-device parallel DNN inference framework that maximizes the resource utilization of heterogeneous edge device clusters. We design a multiple fused-layer block parallelization strategy to reduce inter-device communication during parallel inference. Furthermore, we add early-exit branches to DNNs, empowering devices to trade off latency and accuracy for a variety of sophisticated tasks. Experimental results show that EdgeSP accelerates inference by 2.3x-3.7x for DNN inference tasks of various scales and outperforms the existing naive parallel inference method. Additionally, EdgeSP can provide high-accuracy inference services under various latency requirements.
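The early-exit idea mentioned in the abstract can be illustrated in general terms: intermediate classifier heads are attached after early stages of a network, and inference stops at the first head whose prediction is confident enough, trading a little accuracy for lower latency. The sketch below is a minimal, framework-free illustration of that generic mechanism, not EdgeSP's actual design; the function names, the confidence-threshold policy, and the toy linear "stages" are all assumptions for demonstration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def early_exit_inference(x, stages, exit_heads, threshold):
    """Run network stages in order; after each stage, an exit head
    produces class logits. Return the first prediction whose top
    softmax probability reaches the threshold (confidence-based
    exit policy -- an assumption for this sketch), else the final
    head's prediction."""
    h = x
    for stage, head in zip(stages, exit_heads):
        h = stage(h)                      # forward through one stage
        probs = softmax(head(h))          # this stage's exit classifier
        if probs.max() >= threshold:      # confident enough: exit early
            return int(probs.argmax()), float(probs.max())
    return int(probs.argmax()), float(probs.max())  # last (final) exit

# Toy demo: two trivial "stages" and identity exit heads.
stages = [lambda h: h * 2.0, lambda h: h + 1.0]
heads = [lambda h: h, lambda h: h]
pred, conf = early_exit_inference(np.array([0.1, 0.3]), stages, heads,
                                  threshold=0.5)
```

Lowering the threshold makes more inputs exit at shallow heads (faster, possibly less accurate); raising it pushes inputs toward the final exit, which is the latency/accuracy trade-off the abstract refers to.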

Keyphrases: Deep Neural Networks, early exit, Edge Computing, Edge Intelligence, Internet of Things, Parallel Inference

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:7483,
  author = {Zhipeng Gao and Shan Sun and Yinghan Zhang and Zijia Mo and Chen Zhao},
  title = {EdgeSP: Scalable Multi-Device Parallel DNN Inference on Heterogeneous Edge Clusters},
  howpublished = {EasyChair Preprint no. 7483},
  year = {EasyChair, 2022}}