Edge-AI-Paper-List
:warning: This repository is not actively maintained. Check out our survey paper on efficient LLMs and the corresponding paper list.
Target venues: system conferences (OSDI/SOSP/ATC/EuroSys/ASPLOS), network conferences (NSDI/SIGCOMM) and mobile conferences (MobiCom/MobiSys/SenSys/UbiComp).
We will keep maintaining this list :)
Note: Edge here refers to resource-constrained devices, not edge servers; AI here mostly refers to deep learning.
Attention: we are maintaining a dedicated paper list for resource-efficient LLM algorithms/systems.
Smartphones
2023
[ASPLOS'23] TelaMalloc: Efficient On-Chip Memory Allocation for Production Machine Learning Accelerators
2022
[MobiSys'22] FabToys: Plush Toys with Large Arrays of Fabric-based Pressure Sensors to Enable Fine-grained Interaction Detection
[MobiSys'22] Floo: Automatic, Lightweight Memoization for Faster Mobile Apps
[MobiCom'22] A-Mash: Providing Single-App Illusion for Multi-App Use through User-centric UI Mashup
[MobiCom'22] Tutti: Coupling 5G RAN and Edge Computing for Latency-critical Video Analytics
2021
[MobiCom'21] AsyMo: scalable and efficient deep-learning inference on asymmetric mobile CPUs
[MobiCom'21] Elf: accelerate high-resolution mobile deep vision with content-aware parallel offloading
[MobiCom'21] UltraSE: single-channel speech enhancement using ultrasound
[MobiCom'21] Experience: a five-year retrospective of MobileInsight
[MobiCom'21] LegoDNN: block-grained scaling of deep neural networks for mobile vision
[MobiSys'21] Tap: an app framework for dynamically composable mobile systems
[MobiSys'21] zTT: learning-based DVFS with zero thermal throttling for mobile devices
[ATC'21] Octo: INT8 Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning
2020
[MobiCom'20] Deep Learning Based Wireless Localization for Indoor Navigation
[MobiCom'20] SPINN: Synergistic Progressive Inference of Neural Networks over Device and Cloud
[MobiCom'20] Heimdall: Mobile GPU Coordination Platform for Augmented Reality Applications
[MobiCom'20] NEMO: Enabling Neural-enhanced Video Streaming on Commodity Mobile Devices
[MobiCom'20] OnRL: Improving Mobile Video Telephony via Online Reinforcement Learning
[ASPLOS'20] PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning
[MobiSys'20] Deep Compressive Offloading: Speeding up Neural Network Inference by Trading Edge Computation for Network Latency
[MobiSys'20] Fast and scalable In-memory Deep Multitask Learning via Neural Weight Virtualization
[MobiSys'20] MDLdroidLite: A Release-and-inhibit Control Approach to Resource-efficient Deep Neural Networks on Mobile Devices
[MobiSys'20] RF-net: A Unified Meta-learning Framework for RF-enabled One-shot Human Activity Recognition
[SenSys'20] MobiPose: real-time multi-person pose estimation on mobile devices
2019 and before
[MobiCom'19] RNN-Based Room Scale Hand Motion Tracking
[MobiCom'19] MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors
[EuroSys'19] µLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization
[SenSys'19] DeepAPP: A Deep Reinforcement Learning Framework for Mobile Application Usage Prediction
[MobiCom'18] DeepCache: Principled Cache for Mobile Deep Vision
[MobiCom'18] NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision
[MobiCom'18] FoggyCache: Cross-Device Approximate Computation Reuse
[MobiSys'18] On-Demand Deep Model Compression for Mobile Devices: A Usage-Driven Model Selection Framework
[MobiSys'18] FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices
[MobiSys'17] Accelerating Mobile Audio Sensing Algorithms through On-Chip GPU Offloading
[MobiSys'17] MobileDeepPill: A Small-Footprint Mobile Deep Learning System for Recognizing Unconstrained Pill Images
[MobiSys'17] DeepEye: Resource Efficient Local Execution of Multiple Deep Vision Models using Wearable Commodity Hardware
[MobiSys'17] DeepMon: Building Mobile GPU Deep Learning Models for Continuous Vision Applications
[ASPLOS'17] Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge
[UbiComp'16] SpotGarbage: Smartphone App to Detect Garbage Using Deep Learning
[UbiComp'15] DeepEar: robust smartphone audio sensing in unconstrained acoustic environments using deep learning
AR/VR
[MobiCom'23] AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality
[MobiCom'22] SalientVR: saliency-driven mobile 360-degree video streaming with gaze information
[MobiCom'21] Face-Mic: inferring live speech and speaker identity via subtle facial dynamics captured by AR/VR motion sensors
[MobiSys'21] Xihe: a 3D vision-based lighting estimation framework for mobile augmented reality
[MobiSys'21] LensCap: split-process framework for fine-grained visual privacy control for augmented reality apps
[ASPLOS'20] Coterie: Exploiting Frame Similarity to Enable High-Quality Multiplayer VR on Commodity Mobile Devices
[MobiCom'19] Edge Assisted Real-time Object Detection for Mobile Augmented Reality
[EuroSys'19] Transparent AR Processing Acceleration at the Edge
[ASPLOS'21] Q-VR: System-Level Design for Future Collaborative Virtual Reality Rendering
[ATC'20] Firefly: Untethered Multi-user VR for Commodity Mobile Devices
IoTs
2023
[MobiCom'23] Re-thinking computation offload for efficient inference on IoT devices with duty-cycled radios
[NSDI'23] Gemel: Model Merging for Memory-Efficient, Real-Time Video Analytics at the Edge
[ASPLOS'23] Space-Efficient TREC for Enabling Deep Learning on Microcontrollers
[ASPLOS'23] STI: Turbocharge NLP Inference at the Edge via Elastic Pipelining
[ASPLOS'23] HuffDuff: Stealing Pruned DNNs from Sparse Accelerators
[HPCA'23] GROW: A Row-Stationary Sparse-Dense GEMM Accelerator for Memory-Efficient Graph Convolutional Neural Networks
[HPCA'23] Mix-GEMM: An efficient HW-SW Architecture for Mixed-Precision Quantized Deep Neural Networks Inference on Edge Devices
[HPCA'23] FlowGNN: A Dataflow Architecture for Real-Time Workload-Agnostic Graph Neural Network Inference
[ISCA'23] Inter-layer Scheduling Space Definition and Exploration for Tiled Accelerators
2022
[MobiSys'22] TEO: Ephemeral Ownership for IoT Devices to Provide Granular Data Control
[MobiSys'22] TinyNet: a Lightweight, Modular, and Unified Network Architecture for the Internet of Things
[MobiSys'22] Bringing WebAssembly to Resource-constrained IoT Devices for Seamless Device-Cloud Integration
[MobiCom'22] RetroIoT: Retrofitting Internet of Things Deployments by Hiding Data in Battery Readings
[MobiSys'22] DeepMix: Mobility-aware, Lightweight, and Hybrid 3D Object Detection for Headsets
[ATC'22] CoVA: Exploiting Compressed-Domain Analysis to Accelerate Video Analytics
[EuroSys'22] LiteReconfig: Cost and Content Aware Reconfiguration of Video Object Detection Systems for Mobile GPUs
[SenSys'22] AutoMatch: Leveraging Traffic Camera to Improve Perception and Localization of Autonomous Vehicles
[NeurIPS'22] On-Device Training Under 256KB Memory
2021
[ATC'21] Video Analytics with Zero-streaming Cameras
[ATC'21] Fine-tuning giant neural networks on commodity hardware with automatic pipeline model parallelism
[ATC'21] Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew
[ASPLOS'21] Rhythmic Pixel Regions: Visual sensing architecture for flexible spatiotemporal resolution towards high-precision visual computing at low power
[NSDI'21] AIRCODE: Hidden Screen-Camera Communication on an Invisible and Inaudible Dual Channel
[NSDI'21] MAVL: Multiresolution Analysis of Voice Localization
2020
[MobiSys'20] Approximate Query Service on Autonomous IoT Cameras
[MobiSys'20] EMO: Real-time Emotion Recognition From Single-eye Images for Resource-constrained Eyewear Devices
[MobiCom'20] CLIO: Enabling Automatic Compilation of Deep Learning Pipelines Across IoT and Cloud
[MobiCom'20] EagleEye: Wearable Camera-based Person Identification in Crowded Urban Spaces
[SIGCOMM'20] Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics
[EuroSys'20] Balancing efficiency and fairness in heterogeneous GPU clusters for deep learning
[OSDI'20] A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters
[OSDI'20] PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
[OSDI'20] Serving DNNs like Clockwork: Performance Predictability from the Bottom Up
2019 and before
[MobiCom'19] Source Compression with Bounded DNN Perception Loss for IoT Edge Computer Vision
[SenSys'19] Neuro.ZERO: A Zero-energy Neural Network Accelerator for Embedded Sensing and Inference Systems
[UbiComp'19] Performance Characterization of Deep Learning Models for Breathing-based Authentication on Resource-Constrained Devices
[ASPLOS'18] SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing
[SenSys'17] DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework
[MobiSys'17] Glimpse: A Programmable Early-Discard Camera Architecture for Continuous Mobile Vision
[UbiComp'17] Low-resource Multi-task Audio Sensing for Mobile and Embedded Devices via Shared Deep Neural Network Representations
[MobiSys'16] MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints
[SenSys'16] Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables
Energy-harvested devices
[MobiCom'23] LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup
[MobiCom'23] AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments
[MobiSys'20] Approximate Query Service on Autonomous IoT Cameras
[SenSys'20] Ember: Energy Management of Batteryless Event Detection Sensors with Deep Reinforcement Learning
[ASPLOS'19] Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems
[ASPLOS'21] Quantifying the Design-Space Tradeoffs in Autonomous Drones
[ASPLOS'21] Rhythmic Pixel Regions: Visual sensing architecture for flexible spatiotemporal resolution towards high-precision visual computing at low power
[ATC'22] PilotFish: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training
Privacy & Security
2023
[MobiCom'23] Efficient Federated Learning for Modern NLP
[MobiCom'23] Federated Few-shot Learning for Mobile NLP
[MobiCom'23] Enc2: Privacy-Preserving Inference for Tiny IoTs via Encoding and Encryption
[MobiCom'23] AutoFed: Heterogeneity-Aware Federated Multimodal Learning for Robust Autonomous Driving
2022
[MobiCom'22] Audio-domain Position-independent Backdoor Attack via Unnoticeable Triggers
[MobiCom'22] Sifter: Protecting Security-Critical Kernel Modules in Android through Attack Surface Reduction
[ASPLOS'22] Eavesdropping User Credentials via GPU Side Channels on Smartphones
[NSDI'22] Privid: Practical, Privacy-Preserving Video Analytics Queries
[EuroSys'22] Minimum Viable Device Drivers for ARM TrustZone
[ATC'22] PRIDWEN: Universally Hardening SGX Programs via Load-Time Synthesis
[ATC'22] HyperEnclave: An Open and Cross-platform Trusted Execution Environment
[OSDI'22] BlackBox: A Container Security Monitor for Protecting Containers on Untrusted Operating Systems
[OSDI'22] Blockaid: Data Access Policy Enforcement for Web Applications
2021
[MobiCom'21] PECAM: privacy-enhanced video streaming and analytics via securely-reversible transformation
[MobiSys'21] SafetyNOT: on the usage of the SafetyNet attestation API in Android
[MobiSys'21] Rushmore: securely displaying static and animated images using TrustZone
[OSDI'21] Privacy Budget Scheduling
[OSDI'21] Addra: Metadata-private voice communication over fully untrusted infrastructure
[OSDI'21] MAGE: Nearly Zero-Cost Virtual Memory for Secure Computation (Awarded Best Paper!)
[OSDI'21] Zeph: Cryptographic Enforcement of End-to-End Data Privacy
2020 and before
[MobiCom'20] FaceRevelio: A Face Liveness Detection System for Smartphones with A Single Front Camera
[ASPLOS'20] DNNGuard: An Elastic Heterogeneous DNN Accelerator Architecture against Adversarial Attacks
[UbiComp'20] Countering Acoustic Adversarial Attacks in Microphone-equipped Smart Home Devices
[UbiComp'19] DeepType: On-Device Deep Learning for Input Personalization Service with Minimal Privacy Concern
[UbiComp'19] Keyboard Snooping from Mobile Phone Arrays with Mixed Convolutional and Recurrent Neural Networks
[MobiCom'19] Occlumency: Privacy-preserving Remote Deep-learning Inference Using SGX
[EuroSys'19] Forward and Backward Private Searchable Encryption with SGX
[SOSP'19] Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform
[SOSP'19] Honeycrisp: Large-scale Differentially Private Aggregation Without a Trusted Core
[SOSP'19] Yodel: Strong Metadata Security for Voice Calls
Learning
Strikethrough indicates that these papers may have nothing to do with mobile
2023
[ICLR'23] MocoSFL: enabling cross-client collaborative self-supervised learning
[EuroSys'23] REFL: Resource-Efficient Federated Learning
[NSDI'23] FLASH: Towards a High-performance Hardware Acceleration Architecture for Cross-silo Federated Learning
[NSDI'23] RECL: Responsive Resource-Efficient Continuous Learning for Video Analytics
2022
[MICRO'22] GCD2: A Globally Optimizing Compiler for Mapping DNNs to Mobile DSPs
[MobiSys'22] mGEMM: Low-latency Convolution with Minimal Memory Overhead Optimized for Mobile Devices
[MobiSys'22] Band: Coordinated Multi-DNN Inference on Heterogeneous Mobile Processors
[MobiSys'22] CoDL: Efficient CPU-GPU Co-execution for Deep Learning Inference on Mobile Devices
[MobiSys'22] FedBalancer: Data and Pace Control for Efficient Federated Learning on Heterogeneous Clients
[MobiSys'22] Memory-efficient DNN Training on Mobile Devices
[MobiSys'22] Melon: Breaking the Memory Wall for Resource-Efficient On-Device Machine Learning
[MobiCom'22] Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI
[MobiCom'22] Romou: Rapidly Generate High-Performance Tensor Kernels for Mobile GPUs
[MobiCom'22] InFi: end-to-end learnable input filter for resource-efficient mobile-centric inference
[MobiCom'22] PyramidFL: A Fine-grained Client Selection Framework for Efficient Federated Learning
[MobiCom'22] Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading
[MobiCom'22] NeuLens: Spatial-based Dynamic Acceleration of Convolutional Neural Networks on Edge
[MobiCom'22] RF-URL: Unsupervised Representation Learning for RF Sensing
[MobiCom'22] Cosmo: Contrastive Fusion Learning with Small Data for Multimodal Human Activity Recognition
[SenSys'22] BlastNet: Exploiting Duo-Blocks for Cross-Processor Real-Time DNN Inference
[SenSys'22] PriMask: Cascadable and Collusion-Resilient Data Masking for Mobile Cloud Inference
[UbiComp'22] Context-Aware Compilation of DNN Training Pipelines across Edge and Cloud
[OSDI'22] Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning
[ATC'22] Campo: Cost-Aware Performance Optimization for Mixed-Precision Neural Network Training
[ATC'22] SOTER: Guarding Black-box Inference for General Neural Networks at the Edge
[EuroSys'22] Varuna: Scalable, Low-cost Training of Massive Deep Learning Models (Best Paper Award)
2021
[MobiCom'21] Hermes: an efficient federated learning framework for heterogeneous mobile clients
[MobiSys'21] PPFL: privacy-preserving federated learning with trusted execution environments
[MobiSys'21] ClusterFL: a similarity-aware federated learning system for human activity recognition
[MobiSys'21] nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices
[SenSys'21] FedDL: Federated Learning via Dynamic Layer Sharing for Human Activity Recognition
[SenSys'21] Mercury: Efficient On-Device Distributed DNN Training via Stochastic Importance Sampling
[SenSys'21] FedMask: Joint Computation and Communication-Efficient Personalized Federated Learning via Heterogeneous Masking
[NSDI'21] Mistify: Automating DNN Model Porting for On-Device Inference at the Edge
[OSDI'21] Oort: Efficient Federated Learning via Guided Participant Selection
[ATC'21] Jump-Starting Multivariate Time Series Anomaly Detection for Online Service Systems
[ATC'21] Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
2020 and before
[MobiCom'20] Billion-scale Federated Learning on Mobile Clients: a submodel design with tunable privacy
[OSDI'20] A Tensor Compiler for Unified Machine Learning Prediction Serving
[SenSys'19] MetaSense: Few-shot Adaptation to Untrained Conditions in Deep Mobile Sensing
[UbiComp'18] DeepType: On-Device Deep Learning for Input Personalization Service with Minimal Privacy Concern
Another awesome paper list about Federated Learning