Article ID | Journal | Published Year | Pages
---|---|---|---
563697 | Signal Processing | 2014 | 13
Highlights:

- The metric evaluates video fusion methods for their spatial–temporal consistency.
- The metric shows high stability and robustness in a noisy environment.
- Spatial–temporal phase congruency is employed as a feature to be compared.
- 3D zero-mean normalized cross-correlation is employed as a similarity measure.
- The spatial–temporal structure tensor is employed to define the required weights.
Most image or video fusion quality metrics are designed to evaluate different fusion methods in terms of spatial–temporal information extraction, and there is limited research on the evaluation of spatial–temporal consistency. In this paper, a video fusion quality metric is proposed to evaluate different fusion methods for spatial–temporal consistency, where spatial–temporal phase congruency is employed as the feature to be compared and 3D zero-mean normalized cross-correlation as the similarity measure. First, the spatial–temporal phase congruency maps of the input and fused videos are computed using a set of predefined 3D Log-Gabor filters. Then the phase congruency maps are divided into non-overlapping spatial–temporal blocks, and a local block-based quality metric is defined by performing 3D zero-mean normalized cross-correlation on the corresponding blocks of the input and fused videos. Finally, the global quality metric is constructed as the weighted average of all the block-based quality metrics, where the required local and global weights are defined by the spatial–temporal structure tensor. Several sets of experiments demonstrate the validity and feasibility of the proposed metric. Moreover, the proposed metric shows higher stability and robustness than several other metrics in a noisy environment.
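The block-matching step described above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the phase congruency volumes are assumed to be precomputed 3D arrays (time × height × width), the block size is an arbitrary choice, and the helper names (`zncc_3d`, `block_metric`) are hypothetical. Phase congruency computation via 3D Log-Gabor filters and the structure-tensor weighting are omitted.

```python
import numpy as np

def zncc_3d(a, b, eps=1e-12):
    """Zero-mean normalized cross-correlation between two equally
    sized 3D (spatial-temporal) blocks. Returns a value in [-1, 1];
    eps guards against division by zero on constant blocks."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum()) + eps
    return float((a * b).sum() / denom)

def block_metric(pc_input, pc_fused, block=(8, 8, 8)):
    """Divide two phase-congruency volumes into non-overlapping
    spatial-temporal blocks and return the per-block ZNCC scores.
    Block size (8, 8, 8) is an assumption for illustration."""
    T, H, W = pc_input.shape
    bt, bh, bw = block
    scores = []
    for t in range(0, T - bt + 1, bt):
        for y in range(0, H - bh + 1, bh):
            for x in range(0, W - bw + 1, bw):
                scores.append(zncc_3d(
                    pc_input[t:t + bt, y:y + bh, x:x + bw],
                    pc_fused[t:t + bt, y:y + bh, x:x + bw]))
    return np.array(scores)
```

A global score would then be a weighted average of these per-block values; in the paper the weights come from the spatial–temporal structure tensor, whereas a plain `scores.mean()` would correspond to uniform weighting.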