MegaFlow: Zero-Shot Large Displacement Optical Flow

¹ETH Zurich    ²Microsoft
MegaFlow Teaser

MegaFlow excels at large displacement optical flow and point tracking. (a) On the Sintel (Final) benchmark, MegaFlow consistently achieves the lowest End-Point Error (EPE), and its advantage widens significantly on large displacements. (b) MegaFlow also achieves superior zero-shot point tracking results on TAP-Vid. (c) Qualitative examples with inset error maps further illustrate these results.
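For readers unfamiliar with the metric, the following is a minimal sketch of how EPE can be computed and split by ground-truth displacement magnitude. The function names and the bucket edges (0 to 10, 10 to 40, and above 40 pixels) are assumptions chosen for illustration; they are not taken from the MegaFlow evaluation code.

```python
import numpy as np


def endpoint_error(flow_pred, flow_gt):
    """Per-pixel Euclidean distance between two (H, W, 2) flow fields."""
    return np.linalg.norm(flow_pred - flow_gt, axis=-1)


def epe_by_displacement(flow_pred, flow_gt, valid=None,
                        edges=(0.0, 10.0, 40.0, np.inf)):
    """Mean EPE overall and per ground-truth displacement bucket.

    The bucket edges are an illustrative assumption, in pixels.
    """
    epe = endpoint_error(flow_pred, flow_gt)
    magnitude = np.linalg.norm(flow_gt, axis=-1)
    if valid is None:
        valid = np.ones(epe.shape, dtype=bool)

    result = {"all": float(epe[valid].mean())}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = valid & (magnitude >= lo) & (magnitude < hi)
        # Empty buckets are reported as NaN rather than silently as zero.
        result[f"s{lo:g}-{hi:g}"] = float(epe[mask].mean()) if mask.any() else float("nan")
    return result
```

Reporting the buckets separately is what makes a gap on large displacements visible; a single averaged EPE is dominated by the many pixels with small motion.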

Overview Video

Pipeline Overview

MegaFlow Pipeline

Given an input sequence, a frozen DINO and a trainable CNN extract dense patch tokens and local structural features, respectively. Alternating frame and global attention layers, followed by feature fusion, process these tokens into a globally consistent representation. Pair-wise global matching then computes initial flows. Finally, a recurrent module iteratively refines these flows to sub-pixel accuracy using spatial convolutions and temporal attention. Crucially, the same design handles variable-length inputs and extends to point tracking without architectural modifications.
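The global matching step can be illustrated with a small NumPy sketch. It assumes one common formulation: a softmax over the all-pairs feature correlation, followed by an expected-coordinate readout. The page does not specify the exact matching layer, so the function name `global_matching_flow`, the scaling by the square root of the channel count, and the feature layout are all assumptions.

```python
import numpy as np


def global_matching_flow(feat_src, feat_tgt):
    """Initial flow from dense features of shape (H, W, C).

    Returns an (H, W, 2) flow in (x, y) order, in units of feature-grid cells.
    This is an illustrative formulation, not the MegaFlow implementation.
    """
    H, W, C = feat_src.shape
    src = feat_src.reshape(H * W, C)
    tgt = feat_tgt.reshape(H * W, C)

    # All-pairs correlation: every source location against every target location.
    corr = src @ tgt.T / np.sqrt(C)

    # Numerically stable softmax over target locations.
    corr -= corr.max(axis=1, keepdims=True)
    prob = np.exp(corr)
    prob /= prob.sum(axis=1, keepdims=True)

    # Pixel coordinate grid in (x, y) order.
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    grid = np.stack([xs, ys], axis=-1).reshape(H * W, 2).astype(np.float64)

    # Expected matching coordinate minus source coordinate gives the flow.
    matched = prob @ grid
    return (matched - grid).reshape(H, W, 2)
```

Because every source location is compared against every target location, the displacement that can be recovered is not limited by a local search window, which is the property that matters for large motions. The cost is a correlation matrix with (H·W)² entries, so matching of this kind is typically run at reduced feature resolution.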

Optical Flow Results

Point Tracking Results

* Displaying 1/16 of the points. Note: Because this is a zero-shot tracking application of our flow model, point visibility is not explicitly predicted, so points continue to be tracked through occlusions.
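One way a flow model can be used for point tracking is sketched below. It assumes the model returns flow from a reference frame to every other frame in the sequence; each query point is then displaced by the flow sampled at its location. The helper names `bilinear_sample` and `track_points` and the array layouts are hypothetical, and the actual MegaFlow tracking procedure may differ.

```python
import numpy as np


def bilinear_sample(field, pts):
    """Sample an (H, W, C) field at float (x, y) points of shape (N, 2).

    Points outside the image are clamped to the border.
    """
    H, W = field.shape[:2]
    x = np.clip(pts[:, 0], 0, W - 1)
    y = np.clip(pts[:, 1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = (x - x0)[:, None]
    wy = (y - y0)[:, None]
    top = (1 - wx) * field[y0, x0] + wx * field[y0, x1]
    bottom = (1 - wx) * field[y1, x0] + wx * field[y1, x1]
    return (1 - wy) * top + wy * bottom


def track_points(flows_from_ref, queries):
    """Track query points using flows from a reference frame.

    flows_from_ref: (T, H, W, 2) flow from the reference frame to each of T frames.
    queries: (N, 2) point positions in (x, y) order in the reference frame.
    Returns (T, N, 2) positions. No visibility flag is produced.
    """
    queries = np.asarray(queries, dtype=np.float64)
    return np.stack([queries + bilinear_sample(flow, queries) for flow in flows_from_ref])
```

The output contains positions only, which mirrors the note above: without a visibility estimate, a point that becomes occluded still receives a position in every frame.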

BibTeX

@inproceedings{zhang2026megaflow,
  title     = {MegaFlow: Zero-Shot Large Displacement Optical Flow},
  author    = {Zhang, Dingxi and Wang, Fangjinhua and Pollefeys, Marc and Xu, Haofei},
  booktitle = {arXiv preprint arXiv:},
  year      = {2026}
}