

Contributions to Proceedings:

A. Glinserer, M. Lechner, A. Wendt:
"Automated Pruning of Neural Networks for Mobile Applications";
in: "2021 IEEE 19th International Conference on Industrial Informatics (INDIN)", edited by: IEEE; IEEE, 2021, ISBN: 978-1-7281-4395-8.



English abstract:
Pruning is a useful method to compress neural networks and to further reduce the required computations and thus the inference time. This work presents an automatic pruning workflow that uses a measurement-based method to determine which portions of the network contribute only little to the total accuracy. Furthermore, to increase prunability in networks containing residual blocks, this work evaluates zero-padding as a useful complement to existing pruning methods. With zero-padding added to the pruning, the automatic pruning process can also select layers that could otherwise not be pruned, or could be pruned only by removing additional filters that might contribute to the total accuracy. Zero-padding adds the removed channels back into the original output feature map so that the shapes remain identical while the computations are saved. Using this method, we achieved a speedup of up to 21% on CPU-based platforms and 5-6% with GPU-based execution on a MobileNetV2. With only little additional retraining time, the pruned network became comparable to an original network with an applied depth multiplier.
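The zero-padding idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 1x1 convolution, the tensor sizes, the list of kept filters, and the helper names `pointwise_conv` and `zero_pad_channels` are all assumptions chosen for illustration. It shows only that a pruned layer inside a residual block can still be added to the skip connection once its output is padded back to the original channel count.

```python
import numpy as np

def pointwise_conv(x, w):
    """1x1 convolution: x has shape (C_in, H, W), w has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def zero_pad_channels(y_pruned, kept, n_channels):
    """Place the kept channels at their original positions in a zero tensor."""
    y_full = np.zeros((n_channels,) + y_pruned.shape[1:], dtype=y_pruned.dtype)
    y_full[kept] = y_pruned
    return y_full

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))   # block input, also used as the skip connection
w = rng.standard_normal((8, 8))      # 8 filters of a hypothetical 1x1 layer
kept = np.array([0, 2, 3, 5, 7])     # assumed result of a pruning decision

# Only the 5 kept filters are computed; filters 1, 4 and 6 cost nothing.
y_pruned = pointwise_conv(x, w[kept])

# Padding restores the 8-channel shape, so the residual addition still works.
block_out = x + zero_pad_channels(y_pruned, kept, n_channels=8)

# Reference: the full-size layer with the pruned filters set to zero.
w_masked = np.zeros_like(w)
w_masked[kept] = w[kept]
reference = x + pointwise_conv(x, w_masked)
```

In this sketch the pruned channels of the block output are simply the skip connection, and the result matches the full-size layer whose pruned filters are zeroed, while only the kept filters are evaluated.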

Keywords:
Optimization, Latency, Accuracy, Machine learning, Embedded machine learning


"Official" electronic version of the publication (according to its Digital Object Identifier - DOI)
http://dx.doi.org/10.1109/INDIN45523.2021.9557525

Electronic version of the publication:
https://publik.tuwien.ac.at/files/publik_302100.pdf


Created from the Publication Database of the Vienna University of Technology.