Video Coding Machine Architecture for Smart Urban Traffic Optimization with Deep Learning

Document Type: Original Article

Authors

1 Department of Electrical and Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran

2 Master’s Student in Computer Science, Faculty of Mathematics, Statistics, and Computer Science, University of Sistan and Baluchestan, Zahedan, Iran

3 Department of Computer Engineering, Faculty of Electrical and Computer Engineering, Velayat University, Iranshahr, Iran

10.22091/jdaid.2025.14036.1003

Abstract

Intelligent Transportation Systems (ITS) are essential to modern urban infrastructure, but they struggle to process large volumes of traffic video in real time under bandwidth and latency constraints. This paper introduces a novel Video Coding Machine (VCM) architecture that combines Versatile Video Coding (VVC), an adaptive bitrate optimization algorithm driven by neural features, and a hybrid Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) model for compression and congestion prediction. The VVC core, enhanced by dynamic quantization parameter (QP) adjustment, reduces data volume while preserving perceptual quality; the CNN extracts spatial features (e.g., vehicle density), and the RNN captures temporal dynamics for forecasting. Evaluated on real-world datasets (Cityscapes, BDD100K, and Tehran traffic), the system attains 94% prediction accuracy (93% precision, 95% recall), a 60% reduction in data volume, and 25% faster processing than baselines such as H.264/AVC and H.265/HEVC. The framework offers a scalable, efficient solution for smart cities: it supports real-time ITS applications, lowers storage and transmission costs, and improves urban mobility and safety. By bridging advanced compression and deep learning, it advances sustainable traffic management.
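The abstract does not spell out how neural features drive the QP adjustment. As a rough illustration only, the idea might be sketched as a mapping from a CNN-derived vehicle-density score to a per-frame QP. The function name `adaptive_qp`, the linear mapping, and the default bounds `qp_low` and `qp_high` are all assumptions made for this sketch and are not taken from the paper; the only external fact used is that VVC QP values for 8-bit video lie in the range 0 to 63.

```python
# Hypothetical sketch: choose a VVC QP per frame from a density score.
# The linear rule and the default bounds are illustrative assumptions,
# not the paper's algorithm.

VVC_QP_MIN = 0
VVC_QP_MAX = 63  # QP range for 8-bit video in VVC


def adaptive_qp(density: float, qp_low: int = 22, qp_high: int = 42) -> int:
    """Return a QP for a frame given a density score in [0, 1].

    Busy frames (density near 1) get a lower QP, i.e. finer quantization,
    so that vehicles stay recognizable for the prediction model.
    Near-empty frames get a higher QP to save bitrate.
    """
    if not VVC_QP_MIN <= qp_low <= qp_high <= VVC_QP_MAX:
        raise ValueError("expected 0 <= qp_low <= qp_high <= 63")
    # Clamp the score so out-of-range inputs cannot push QP past the bounds.
    d = min(max(density, 0.0), 1.0)
    # Linear interpolation from qp_high (empty road) to qp_low (dense traffic).
    return round(qp_high - d * (qp_high - qp_low))


if __name__ == "__main__":
    for score in (0.0, 0.5, 1.0):
        print(f"density={score:.1f} -> QP {adaptive_qp(score)}")
```

In a real pipeline the density score would come from the CNN stage and the chosen QP would be passed to the encoder's rate control. A learned or rate-distortion-aware mapping could replace the linear rule without changing this interface.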
