Research Focus

Neural Image & Video Coding

We research end-to-end optimized neural network-based image and video compression technologies that go beyond traditional hand-crafted codecs (AVC, HEVC, VVC, etc.). Beyond compression gains, we directly confront the core barrier to practical deployment – computational complexity.

By leveraging learned image coding, we aim to replace conventional hand-crafted approaches and achieve significantly more efficient compression of visual data.

Key Approaches

  • Tool-based Approach: Easily adaptable to existing codec architectures with straightforward extension to inter-coding.
  • End-to-End Approach: A completely new design that demonstrates excellent performance in image compression, with ongoing research into video extension.
  • Quantization: Hardware-friendly quantization of learned compression models that maintains rate-distortion performance while reducing inference latency for practical deployment.
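
As a rough illustration of the quantization approach, the sketch below maps floating-point weights to 8-bit integer codes with a single shared scale. The per-tensor symmetric int8 scheme and the function names are assumptions chosen for the example; the scheme actually used in our models is not specified here.

```python
def quantize_symmetric_int8(weights):
    """Map floats to int8 codes with one shared scale (per-tensor, symmetric)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


weights = [0.50, -1.27, 0.03, 0.0]  # made-up example weights
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

The reconstruction error of each weight is bounded by half the scale, which is why a small scale (a narrow weight range) tends to preserve rate-distortion performance better.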
Research Timeline

Research Highlights

Block-based Learned Image Compression
2024–Present | Accepted to CVPR 2026

State-of-the-art learned image compression (LIC) models take entire feature maps as network inputs, which makes peak memory a hard bottleneck for high-resolution content. Block-based processing mitigates this but introduces blocking artifacts at block boundaries.

We mathematically derive the minimum overlap required to reproduce Full-image LIC results exactly, modeling how overlap propagates layer-by-layer through a CNN using a recursive formula. Several implementation techniques are applied on top of this theoretical foundation.
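
The layer-by-layer propagation can be illustrated with the standard receptive-field recursion for strided convolutions. This is a minimal sketch under stated assumptions, not the formula from the paper: it assumes odd kernel sizes, symmetric "same"-style padding of (k - 1) / 2, and plain (non-transposed) convolutions; the function name and the example layer list are hypothetical.

```python
def required_overlap(layers):
    """Input-side overlap (samples per side) needed so block outputs can match
    full-image outputs.

    layers: list of (kernel_size, stride) tuples in forward order. Odd kernels
    with padding (kernel_size - 1) // 2 are assumed.
    """
    overlap = 0  # no extra context is needed at the final output
    # Walk from the last layer back to the input: each layer multiplies the
    # overlap by its stride and adds half of its kernel extent.
    for kernel_size, stride in reversed(layers):
        overlap = stride * overlap + (kernel_size - 1) // 2
    return overlap


# Hypothetical analysis transform: four 5x5 convolutions, each with stride 2.
encoder = [(5, 2)] * 4
print(required_overlap(encoder))  # 30
```

Layers with 1x1 kernels contribute nothing to the overlap, so only the spatially extended layers determine how much context each block must carry.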

The result: artifact-free reconstruction with zero BD-rate loss compared to Full-image inference, while significantly reducing peak memory and peak MACs across 2K and 4K resolutions.
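
The exactness claim can be checked on a toy case. The sketch below is a simplified 1D stand-in, not our codec: it assumes stride-1 convolutions with zero padding and ReLU activations, and all names, kernels, and the test signal are made up for illustration. Each block is processed with a halo equal to the summed half-kernel widths, clipped at the signal borders, and the cropped outputs are compared with the full-signal result.

```python
def conv1d_same(signal, kernel):
    """Stride-1 convolution with zero padding; output length equals input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]


def run_network(signal, kernels):
    """Apply each convolution followed by a ReLU."""
    for kernel in kernels:
        signal = [max(0.0, v) for v in conv1d_same(signal, kernel)]
    return signal


def run_blockwise(signal, kernels, block_size):
    """Process overlapping blocks and keep only each block's own output range."""
    halo = sum(len(k) // 2 for k in kernels)
    output, peak_input = [], 0
    for start in range(0, len(signal), block_size):
        end = min(start + block_size, len(signal))
        lo = max(0, start - halo)            # clip the halo at the true borders
        hi = min(len(signal), end + halo)
        peak_input = max(peak_input, hi - lo)
        block_out = run_network(signal[lo:hi], kernels)
        output.extend(block_out[start - lo:end - lo])
    return output, peak_input


signal = [float((i * 37) % 11 - 5) for i in range(64)]
kernels = [[0.25, 0.5, 0.25], [0.2, -0.5, 1.0, -0.5, 0.2]]
full = run_network(signal, kernels)
blockwise, peak_input = run_blockwise(signal, kernels, block_size=16)
print(blockwise == full, peak_input, len(signal))
```

Errors caused by the missing neighbors stay inside the halo and are cropped away, so the kept samples are identical while the largest input processed at once is much shorter than the full signal.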

Lightweight Building Blocks (SCM & SC-Gate)
2024–Present | –15.84% BD-rate vs. VTM 9.1 and NNVC SR tool adoption

We design lightweight backbone blocks deployable in two separate contexts.

SCM (for NNVC / standard codec tools): A Spatial-then-Channel Mixing block applied to NN-based super-resolution and in-loop filtering. The NN-based super-resolution tool has been adopted into the NNVC codec software, and the in-loop filtering tool has advanced to the Exploration Experiment (EE) stage of the JVET standardization process, a formal milestone toward inclusion in the standard.

SC-Gate (for LIC): Eliminates all global modeling (Self-attention, Mamba, Bi-RWKV) and uses a single depth-wise convolution for spatial mixing combined with an element-wise gating mechanism. At 28.8% of LALIC’s per-block complexity, it achieves –15.84% BD-rate vs. VTM 9.1, with gains validated on high-resolution datasets (Tecnick 1K, CLIC up to 2K).
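
To make the idea of depth-wise spatial mixing plus element-wise gating concrete, here is a minimal pure-Python sketch. It is not the SC-Gate definition: the 3x3 kernel size, the sigmoid gate computed by a 1x1 convolution, and all function names are assumptions made for illustration.

```python
import math


def depthwise_conv3x3(x, kernels):
    """Apply one 3x3 kernel per channel with zero padding (spatial mixing only)."""
    out = []
    for channel, kernel in zip(x, kernels):
        h, w = len(channel), len(channel[0])
        mixed = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < h and 0 <= jj < w:
                            acc += kernel[di + 1][dj + 1] * channel[ii][jj]
                mixed[i][j] = acc
        out.append(mixed)
    return out


def pointwise(x, weights):
    """1x1 convolution: mix channels independently at each pixel."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wt * x[c][i][j] for c, wt in enumerate(row))
              for j in range(w)] for i in range(h)] for row in weights]


def gated_block(x, dw_kernels, gate_weights):
    """Hypothetical gate: sigmoid(1x1(x)) multiplied element-wise with dwconv(x)."""
    spatial = depthwise_conv3x3(x, dw_kernels)
    gate = pointwise(x, gate_weights)
    return [[[s / (1.0 + math.exp(-g)) for s, g in zip(srow, grow)]
             for srow, grow in zip(sch, gch)]
            for sch, gch in zip(spatial, gate)]


identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
x = [[[1.0, 2.0], [3.0, 4.0]], [[-1.0, 0.0], [2.0, -2.0]]]  # 2 channels, 2x2
y = gated_block(x, [identity, identity], [[0.0, 0.0], [0.0, 0.0]])
print(y)
```

Every operation here is local, so the cost grows linearly with the number of pixels, in contrast to global modeling such as self-attention, whose cost grows quadratically with the number of pixels.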

Generative Compression
2025–Present

MSE-optimized codecs produce over-smoothed, perceptually unrealistic reconstructions at low bitrates. The key insight is that a compressed image should (1) lie on the manifold of real images (Realism) while (2) remaining consistent with the original (Fidelity).

We mathematically formulate the transform and quantization characteristics of AI-based codecs and derive explicit constraints for diffusion model sampling. This constrained diffusion decoding keeps reconstructions on the real-image manifold without sacrificing semantic consistency, outperforming existing generative codecs in fidelity while matching their realism.
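
One simple example of a quantization-derived constraint is bin consistency: a candidate reconstruction should re-quantize to the symbols that were actually transmitted. The sketch below is an assumption-laden illustration, not our derived constraints. It assumes rounding to the nearest integer as the quantizer, operates directly on a latent vector rather than on the output of an analysis transform, and uses hypothetical function names and values.

```python
def quantize(latent):
    """Round-to-nearest quantizer, assumed here as the codec's quantization rule."""
    return [round(v) for v in latent]


def project_to_quantization_cell(latent, symbols, margin=1e-6):
    """Clamp each value into the bin of its transmitted symbol.

    The small margin keeps values strictly inside the bin so that rounding
    ties at the bin edge cannot change the symbol.
    """
    half_width = 0.5 - margin
    return [min(q + half_width, max(q - half_width, v))
            for v, q in zip(latent, symbols)]


symbols = [2, -1, 0]            # transmitted (decoded) symbols
candidate = [2.9, -1.2, -0.8]   # made-up latent of an intermediate diffusion sample
projected = project_to_quantization_cell(candidate, symbols)
print(projected, quantize(projected) == symbols)
```

In a sampler, a projection of this kind would be applied between denoising steps so that the sample can move freely toward realistic images while staying consistent with the bitstream.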

Publications

Related Papers & Contributions

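Title | Venue | Year
Block Based Learned Image Compression without Blocking Artifacts | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026 | 2026
[NNVC] AhG11: NNSR with new backbone block based on Spatial-Channel Mixing (SCM) | 39th ISO/IEC JTC 1/SC 29 JVET | 2025
[NNVC] AhG11: Training NNSR using Reparameterization and Progressive Activation | 38th ISO/IEC JTC 1/SC 29 JVET | 2025
AHG11: VLOP3 with new backbone block based on Spatial-Channel Mixing (SCM) | 40th JVET of ITU-T SG21 WP3/21 and ISO/IEC JTC 1/SC 29 | 2025
Crosscheck of JVET-AN0238 (EE1-4.2: Cross-component enhanced NNSR) | 40th JVET of ITU-T SG21 WP3/21 and ISO/IEC JTC 1/SC 29 | 2025
EE1-4.1: NNSR with new backbone block based on Spatial-Channel Mixing (SCM) | 40th JVET of ITU-T SG21 WP3/21 and ISO/IEC JTC 1/SC 29 | 2025
Spatial-Channel Mixing Block for Neural Network-based Video Coding (NNVC) Tools | IEEE International Conference on Visual Communications and Image Processing (VCIP) 2025 | 2025
A Study on the Patch-Based Processing Method of JPEG-AI and Conditions for Preventing Blocking Artifacts | 2024 Summer Conference of the Korean Institute of Broadcast and Media Engineers | 2024
A Study on Removing Blocking Artifacts in Block-Based End-to-End Image Compression Models Using the NNVC In-Loop Filter | 2024 Summer Conference of the Korean Institute of Broadcast and Media Engineers | 2024
Towards Efficient Image Compression Without Autoregressive Models | Neural Information Processing Systems | 2023
A Study on Deblocking JPEG-Compressed Images with a Convolutional Neural Network Using Sinusoidal Activation Functions | 2023 Summer Conference of the Korean Institute of Broadcast and Media Engineers | 2023
Neural Network-Based Block-Wise Phase Hologram Image Compression | Journal of Broadcast Engineering, Special Issue on Holographic Signal Processing, January 2023 | 2023
Neural Network-Based Image Compression Using Resizing | 2023 Summer Conference of the Korean Institute of Broadcast and Media Engineers | 2023
Directional Shift and Compensation of Latent Information for Neural Network-Based Video Compression | Journal of Broadcast Engineering | 2022
Joint Training of a Neural Image Coding Model and a Super-Resolution Model | 2022 Summer Conference of the Korean Institute of Broadcast and Media Engineers (undergraduate paper) | 2022
Block-Based Neural Image Coding Using Adaptive Resizing | 2022 Summer Conference of the Korean Institute of Broadcast and Media Engineers (undergraduate paper) | 2022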