snibgo's ImageMagick pages

Pixel match

The pixmatch process module locates pixels from one image in another, creating a displacement map.

With ImageMagick, given an input image and a displacement map, we can create an output image with pixels moved as specified in the map. This solves the inverse problem: given an input image and a desired output, how can we create a suitable map?
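
As a rough illustration of the forward process, here is a minimal Python sketch of applying an absolute displacement map. It assumes the map is held as pixel coordinates in nested lists, which is only a convenience for the example; ImageMagick encodes the coordinates in the red and green channels of an image. The name displace is hypothetical.

```python
def displace(image, disp_map):
    """Pull pixels: output[y][x] = image[sy][sx], where (sx, sy) = disp_map[y][x].

    The output has the same size as the map. Coordinates are absolute
    pixel positions (an assumption made for this sketch).
    """
    return [[image[sy][sx] for (sx, sy) in row] for row in disp_map]


image = [[10, 20, 30],
         [40, 50, 60]]

# A map that mirrors the image left-to-right.
mirror = [[(2 - x, y) for x in range(3)] for y in range(2)]

print(displace(image, mirror))
```

The inverse problem addressed on this page is to find a map such as mirror when only the input and the desired output are known.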

The method considers the input pixels as being the central pixels of overlapping windows. The best match for each window is found in the desired output, and the coordinates of the central pixel in the matching output window give the required displacement.

This resembles the prime factorization problem: one direction is easy and the other is hard. The process imageA + map => imageB is simple and quick. We can fairly easily find the inverse map for imageB + map' => imageA. The inverse process imageA + imageB => map is either simple and accurate but slow, or fast but complex and approximate.

References

The pixmatch module was largely inspired by:

The main difference is that my approach is concerned with pixels rather than patches.

Section 3 of that paper describes an "approximate nearest neighbor algorithm", which I have included in the module as "propagate and random trial" or "PART" for short. When describing the random search, the PatchMatch paper says:

Note the search window must be clamped to the bounds of B.

It seems to me that this will bias searches, and hence found results and displacement maps, to the edge of the reference image. It may be better to generate another random coordinate until the bounds are not exceeded. I have implemented both methods, as a compile-time option.
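
The concern can be illustrated with a small one-dimensional Python sketch. It assumes the clamp is applied to the sampled coordinate itself; the functions clamped and resampled are hypothetical names for the two strategies, not the module's code.

```python
import random


def clamped(centre, radius, size, rng):
    """Pick an offset in [-radius, radius], then clamp the result to [0, size-1]."""
    x = centre + rng.randint(-radius, radius)
    return min(max(x, 0), size - 1)


def resampled(centre, radius, size, rng):
    """Pick offsets until the result falls inside [0, size-1].

    Terminates because centre itself is assumed to be inside the bounds.
    """
    while True:
        x = centre + rng.randint(-radius, radius)
        if 0 <= x < size:
            return x


def edge_fraction(pick, trials=20000, seed=1234):
    """Fraction of trials that land exactly on coordinate 0."""
    rng = random.Random(seed)
    hits = sum(pick(2, 10, 100, rng) == 0 for _ in range(trials))
    return hits / trials


print("clamped  :", edge_fraction(clamped))
print("resampled:", edge_fraction(resampled))
```

With the centre two pixels from the edge and a radius of ten, nine of the 21 possible offsets are clamped to coordinate 0, whereas resampling gives each of the 13 valid coordinates an equal chance.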

See also:

Shift-map seems to (if I understand correctly) try all possible displacement maps, seeking the one that minimises the overall cost. In my terms, the cost would be the RMSE difference. Shift-map also has terms for keeping, moving or removing pixels, and a term requiring smoothness in the map. It starts at a coarse level (up to 100x100 pixels), working up to fine detail, apparently in factors of two. At each level, it needs only a 3x3 search for each pixel. The smoothness term allows it to optimise hole-filling at the global level, rather than just individual windows, so that looks interesting. It also does a grow-cut operation on the (relative) displacement map to segment it.

Shift-map is aimed at applications that remove parts of the image while retaining desired objects. It avoids distorting objects, causing sudden "jumps" in unimportant parts of the image. That is good for some applications, but the bench example below is the opposite: we want the object to be distorted.

The process

We have two input images: a reference, and a source. For every small sample in source, we want to know where it occurs in reference. Each small sample may be a pixel in isolation, but we usually care more about pixels in context.

To be more specific: we define a window around every pixel in source. Each of these windows is a subimage. The task is now to find the best match for every subimage within reference. This is trivially simple, and could be done in a simple script: for every window, crop it out and do an IM "-subimage-search". I think this is the only method that is guaranteed to find the best solution.

The problem is that reference and source might contain 35 million pixels each. We would need 35 million subimage-searches, each in an image of 35 million pixels. This would take my laptop about eight years to complete. We need a faster method, even if it doesn't guarantee to find the best solution.
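
For reference, the exhaustive method can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the module's code: images are single-channel nested lists, and coordinates outside an image are clamped to its edge (an assumed virtual-pixel rule). The names window_rmse and exhaustive_match are hypothetical.

```python
import math


def window_rmse(src, sx, sy, ref, rx, ry, radius):
    """RMSE between the window centred on (sx, sy) in src and (rx, ry) in ref."""
    def px(img, x, y):
        # Clamp to the image edge (assumed virtual-pixel behaviour).
        h, w = len(img), len(img[0])
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    total = 0.0
    count = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            d = px(src, sx + dx, sy + dy) - px(ref, rx + dx, ry + dy)
            total += d * d
            count += 1
    return math.sqrt(total / count)


def exhaustive_match(src, ref, radius=1):
    """For every source pixel, try every reference pixel.

    Returns a grid of ((rx, ry), score). This needs
    (source pixels) x (reference pixels) window comparisons.
    """
    result = []
    for sy in range(len(src)):
        row = []
        for sx in range(len(src[0])):
            score, coord = min(
                ((window_rmse(src, sx, sy, ref, rx, ry, radius), (rx, ry))
                 for ry in range(len(ref)) for rx in range(len(ref[0]))),
                key=lambda t: t[0])
            row.append((coord, score))
        result.append(row)
    return result


ref = [[0.0, 0.1, 0.2, 0.3],
       [0.4, 0.5, 0.6, 0.7],
       [0.8, 0.9, 1.0, 0.05]]
matches = exhaustive_match(ref, ref)
print(matches[1][2])
```

Two images of 35 million pixels each would need about 1.2e15 window comparisons this way, which is why the faster, approximate methods matter.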

The process module pixmatch provides a variety of methods for the task. There are many configuration options because different applications have different needs. We can answer questions like: what parts of this picture match what parts of that picture, and how well do they match?

The module executes a number of phases. Remember that the task is to find, for every source pixel in context, the corresponding reference pixel, and that we do this by finding the best matching reference window for every source window.

  1. Assign a random reference coordinate to every source pixel. Compare each source window with the assigned reference window. This gives the initial score for each source pixel.
  2. If a third image is provided, this gives a reference coordinate for each source pixel. Calculate the window comparison score this would give; if it is an improvement, this is the new reference coordinate and score for the source pixel.
  3. Optionally, for each source window, search for a better reference window. This uses one of the search methods: entire, random or skip. Specify search none to omit this phase. The default is search entire.
  4. Optionally, run a propagate and random trial ("PART") phase. This visits all the source pixels in scanline order, then again in reverse scanline order. For each source pixel that hasn't yet been matched (its score isn't "good enough") it tries to propagate a better score from an adjacent source pixel, and then it tries a few random reference windows to see if they can improve the score. This phase is iterated until no more improvements are found, or the option max_iter is reached, or all pixels have a "good enough" score. The default is part on.
  5. An optional sweeping-up phase. Any source pixels that do not have a "good enough" score undergo an exhaustive search in the entire reference image. The default is sweep off.
  6. The reference coordinates and scores are written to an image the same size as the source. This can be used as the third image for another run of the process module, or as a displacement map.

If there is no third image, and search none and part off and sweep off, the process will be very quick but the result will be poor.
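
The random initialisation and the PART phase can be sketched in Python as follows. This is a simplified illustration, not the module's code: it assumes single-channel images as nested lists, a window radius of zero (so the score is the absolute difference of two pixels), and hypothetical names score, part_pass and part.

```python
import random


def score(src, ref, s, r):
    """Absolute difference of two single pixels: the RMSE of a 1x1 window."""
    return abs(src[s[1]][s[0]] - ref[r[1]][r[0]])


def part_pass(src, ref, match, best, threshold, trials, rng, reverse):
    """One pass in scanline order (or reverse). Returns the number of improvements."""
    h, w = len(src), len(src[0])
    rh, rw = len(ref), len(ref[0])
    step = -1 if reverse else 1
    changes = 0
    for y in (range(h - 1, -1, -1) if reverse else range(h)):
        for x in (range(w - 1, -1, -1) if reverse else range(w)):
            if best[y][x] <= threshold:
                continue  # already "good enough"
            candidates = []
            # Propagate from the two neighbours already visited in this pass,
            # keeping the same offset between source and reference.
            for nx, ny in ((x - step, y), (x, y - step)):
                if 0 <= nx < w and 0 <= ny < h:
                    mx, my = match[ny][nx]
                    cx, cy = mx + (x - nx), my + (y - ny)
                    if 0 <= cx < rw and 0 <= cy < rh:
                        candidates.append((cx, cy))
            # Random trials anywhere in the reference.
            candidates += [(rng.randrange(rw), rng.randrange(rh))
                           for _ in range(trials)]
            for c in candidates:
                s = score(src, ref, (x, y), c)
                if s < best[y][x]:
                    match[y][x], best[y][x] = c, s
                    changes += 1
    return changes


def part(src, ref, threshold=0.0, max_iter=10, trials=4, seed=1234):
    """Random initial assignment, then forward and reverse passes until no change."""
    rng = random.Random(seed)
    h, w = len(src), len(src[0])
    rh, rw = len(ref), len(ref[0])
    match = [[(rng.randrange(rw), rng.randrange(rh)) for _ in range(w)]
             for _ in range(h)]
    best = [[score(src, ref, (x, y), match[y][x]) for x in range(w)]
            for y in range(h)]
    totals = [sum(map(sum, best))]  # total score before and after each iteration
    for _ in range(max_iter):
        changes = part_pass(src, ref, match, best, threshold, trials, rng, False)
        changes += part_pass(src, ref, match, best, threshold, trials, rng, True)
        totals.append(sum(map(sum, best)))
        if changes == 0:
            break
    return match, best, totals


ref = [[((x * 7 + y * 13) % 17) / 16 for x in range(8)] for y in range(6)]
src = [row[2:] + row[:2] for row in ref]  # the reference, shifted horizontally
match, best, totals = part(src, ref)
print(totals)
```

A candidate replaces the current match only when it scores better, so the total score can never increase from one iteration to the next.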

Arguments

The module takes a list of two or three images as input, and replaces them with a single image, which is a combined displacement map and score for each pixel. The input images are:

  1. the reference image;
  2. the source image;
  3. an optional displacement map.

The reference and source images can be any sizes. The input displacement map, if given, must be the same size as the source. For the input displacement map, only the red and green channels are relevant. Other channels are ignored.

The output will be a displacement map the same size as the source. The red and green channels will give the horizontal and vertical displacements that would populate the source image from the reference. The blue channel will give the RMSE score for the window centred on that pixel.

When only the reference and source are provided, the limit_search_radius option is used only for the propagation. When the third input image is given, the red and green channels are used as initial estimates for the displacements, and searches will be made within the limit_search_radius.

The output image from one invocation of pixmatch can be used as the third input to another.

Short form   Long form                       Description
wr N         window_radius N                 Radius of window for searches, >= 0.
                                             Window will be D*D square where D = radius*2+1.
                                             Zero is a valid radius, giving a 1x1 window.
                                             Default = 1.
lsr N        limit_search_radius N           Limit radius to search for source, >= 0.
                                             Default = 0 = no limit.
st N         similarity_threshold N          Stop searching for a window when RMSE <= N, when a match is "good enough".
                                             Typically use 0.0 <= N <= 1.0, eg 0.01.
                                             Default = 0 = keep going to find the best match.
hc N         hom_chk N                       Homogeneity check.
                                             N is either a number, typically 0 < N < 0.5, or off to remove the check.
                                             Default = 0.1.
e X          search_to_edges X               Whether to search for matches to image edges, so the result will be influenced by virtual pixels.
                                             Set to on or off.
                                             Default = on.
s X          search X                        Mode for searching, where X is none or entire or random or skip.
                                             Default = entire.
rs N         rand_searches N                 For random searches, the number to make per attempted match.
                                             Default = 0 = minimum of search field width and height.
sn N         skip_num N                      For skip searches, the number of positions to skip in x and y directions.
                                             Default = 0 = window radius.
ref X        refine X                        Whether to refine random and skip searches.
                                             Set to on or off.
                                             Default = on.
part X       propagate_and_random_trial X    Whether to seek matches by propagation and random trials (after search X phase).
                                             Set to on or off.
                                             Default = on.
mpi N        max_part_iter N                 Maximum number of PART iterations.
                                             Set to 0 for no maximum.
                                             Default = 10.
tanc X       terminate_after_no_change X     Whether to terminate PART iterations when no changes are made.
                                             Set to on or off.
                                             Default = on.
swp X        sweep X                         Whether to attempt to match any unmatched pixels by exhaustive (slow) search (after PART phase).
                                             Set to on or off.
                                             Default = off.
dpt X        displacement_type X             Type of displacement map for input and output.
                                             Set to absolute or relative.
                                             Default = absolute.
w filename   write filename                  Write an output image file after each PART iteration.
                                             Example filename: frame_%06d.png
ac N         auto_correct N                  After finding the best score for a pixel (using shortcut methods), if the RMSE >= N, try again with no shortcuts.
                                             Typically use 0.0 <= N < 1.0, eg 0.10.
                                             Default = 0 = no auto correction.
v            verbose                         Write some text output to stderr.

The search options are similar to those for the fillholes module.

Options can be given in any order. Later options will override earlier options. Options are read and validated before processing begins.

For good performance, the module should generally be used with a similarity_threshold. This is because the code continually searches for better matches. (By contrast, fillholes tries all possible matches and returns the best one, however good or bad.)

Either type of displacement map may be used. The absolute type, which is the default, will be slightly faster. If using the output from one pixmatch as the input to another, be sure to use the same type.
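
The relationship between the two types can be sketched in Python. The sketch assumes maps are held as pixel coordinates in nested lists, which is not how ImageMagick encodes them in channel values; the function names are hypothetical.

```python
def absolute_to_relative(abs_map):
    """Relative offset = absolute reference coordinate minus the pixel's own coordinate."""
    return [[(ax - x, ay - y) for x, (ax, ay) in enumerate(row)]
            for y, row in enumerate(abs_map)]


def relative_to_absolute(rel_map):
    """Absolute reference coordinate = the pixel's own coordinate plus the offset."""
    return [[(x + dx, y + dy) for x, (dx, dy) in enumerate(row)]
            for y, row in enumerate(rel_map)]


# An absolute identity map becomes an all-zero relative map.
identity = [[(x, y) for x in range(3)] for y in range(2)]
print(absolute_to_relative(identity))
```

Mixing the types between two runs would cause offsets to be read as coordinates, or the reverse, which is why the same type should be used throughout a chain.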

As general guidance, I suggest:

limit_search_radius, lsr, is used during the search phase, and the propagate component of the PART phase. The random component of the PART phase, and the sweep phase, ignore lsr.

Small photographs

We use two photographs of a park bench, taken at around the same time. They are resized to exactly 600x400 pixels, and we work on those small versions. We will find a displacement map that distorts the first image into the second.

set LGE_BENCH_1=%PICTLIB%20150622\AGA_2378_sRGB.tiff

%IM%convert ^
  %LGE_BENCH_1% ^
  -resize "600x400^!" ^
  -strip ^
  pm_ph1.tiff
[image: pm_ph1.tiff]
set LGE_BENCH_2=%PICTLIB%20150622\AGA_2379_sRGB.tiff

%IM%convert ^
  %LGE_BENCH_2% ^
  -resize "600x400^!" ^
  -strip ^
  pm_ph2.tiff
[image: pm_ph2.tiff]

These have not been sharpened to look good at this size. Sharpening would contaminate future processing, and should be done as the final stage.

We try a single iteration of the PART phase:

 %IMDEV%convert ^
  -seed 1234 ^
  pm_ph1.tiff ^
  pm_ph2.tiff ^
  -process 'pixmatch search none st 0.01 mpi 1 v' ^
  pm_phm1.png >pm_phm1.lis 2^>^&1

call StopWatch 
0 00:00:11
%IM%convert ^
  pm_phm1.png ^
  -separate ^
  pm_phm1s.png

%IM%convert ^
  pm_ph1.tiff ^
  pm_phm1.png ^
  -compose Distort -set option:compose:args 100%%x100%% -composite ^
  pm_phd1.tiff
pixmatch options:
  window_radius 1  limit_search_radius 0  similarity_threshold 0.01  hom_chk 0.1
  search none
  propagate_and_random_trial on  max_part_iter 1  tanc on
  sweep off
  displacement_type absolute  verbose
chklist (pixmatchImage) chkL A   List length 2  chklist OK
---------------------------------------
pixmatch: ref 600x400  src 600x400
CalcScores: nCompares=240000
OnePart: nChanges=366836 nToMatch=214574
IterParts: nIter=1
IterParts: nCompares=19036279
nTotalCompares=19276279
chkentry (MkDispImage) mdi   List length 1  chkentry OK
chkentry (MkDispImage) mdi2   List length 1  chkentry OK
chkentry (pixmatch) be   List length 1  chkentry OK
chklist (pixmatchImage) after rem   List length 1  chklist OK
chklist (pixmatchImage) after rem list   List length 1  chklist OK

Here is the displacement map, pm_phm1.png:

[image: pm_phm1.png]

Here are the separated colour channels of the displacement map:

[images: pm_phm1s-0.png, pm_phm1s-1.png, pm_phm1s-2.png]

Here is the result from applying the displacement map to the first photo:

[image: pm_phd1.tiff]

The result is similar to the second photo, as required.

The displacement map shows streaks between top-left and bottom-right. I think this is a consequence of the propagation only operating between those corners. I don't know if this should be changed, eg by also propagating between top-right and bottom-left. Would this improve the result? (As the streaks are removed by later iterations, probably nothing needs to be done.)

Show some statistics:

%IM%convert ^
  pm_phm1.png ^
  -format "MEAN=%%[fx:mean.b]\nMAXIMA=%%[fx:maxima.b]\n" ^
  info: 
MEAN=0.0384027
MAXIMA=0.345434
%IM%compare -metric RMSE pm_ph2.tiff pm_phd1.tiff NULL:  
2814.63 (0.0429484)

The mean of the blue channel of the map gives a comparable number to IM's usual RMSE metric. It isn't the same because pixmatch calculates the difference between windows, whereas IM's RMSE calculates the RMSE of individual pixels.
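
A small one-dimensional Python sketch shows how the two numbers can differ. It assumes windows of radius 1 with edge clamping and uses made-up difference values; window_rmse_1d is a hypothetical helper, not the module's code.

```python
import math


def window_rmse_1d(diffs, centre, radius):
    """RMSE over the window around centre, clamping indexes to the ends."""
    n = len(diffs)
    idx = [min(max(centre + d, 0), n - 1) for d in range(-radius, radius + 1)]
    return math.sqrt(sum(diffs[i] ** 2 for i in idx) / len(idx))


# Per-pixel differences between two signals: one bad pixel, the rest perfect.
diffs = [0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0]

# Analogue of the mean of the map's blue channel: the mean of per-window RMSE scores.
mean_of_window_scores = sum(
    window_rmse_1d(diffs, c, 1) for c in range(len(diffs))) / len(diffs)

# Analogue of IM's metric: one RMSE over all individual pixels.
overall_rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))

print(mean_of_window_scores, overall_rmse)
```

In this example the mean of the window scores is lower than the overall RMSE, which is the same direction as the difference seen above.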

As a rule of thumb, if two ordinary photographs have an ImageMagick RMSE difference of better than (lower than) 1%, and the difference is evenly spread, I can't see the difference between them. This is why I set the similarity_threshold to 0.01. There is no point spending processing time trying to improve on that.

This result is worse than that. The nToMatch tells us that of the 240,000 pixels, 215,000 of them have not reached the 0.01 threshold. Visually, the displaced image is both noisy and blurry, most noticeable at the edges of the wooden bench. However, it is a close approximation to the required image.

We can use as many stages as we want, with whatever multiple of resize we want. A multiple of two means the next level up needs to search only plus or minus 1, hence lsr 1. Experiments show that -filter cubic is the best filter for both speed and quality of results (the algorithm is less likely to get trapped into wrong solutions).
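
The reasoning behind lsr 1 can be sketched in Python. The sketch assumes an absolute map held in pixel coordinates and a nearest-neighbour upscale, neither of which is how the commands on this page work (they resize the map image with -filter cubic); upscale_map and candidates are hypothetical names.

```python
def upscale_map(coarse, factor=2):
    """Nearest-neighbour upscale of an absolute map held in pixel coordinates.

    Each coarse estimate (cx, cy) becomes a block of fine estimates
    around (cx * factor, cy * factor).
    """
    h, w = len(coarse) * factor, len(coarse[0]) * factor
    fine = []
    for y in range(h):
        row = []
        for x in range(w):
            cx, cy = coarse[y // factor][x // factor]
            row.append((cx * factor + x % factor, cy * factor + y % factor))
        fine.append(row)
    return fine


def candidates(estimate, lsr, ref_w, ref_h):
    """Reference coordinates within lsr of the estimate, inside the reference."""
    ex, ey = estimate
    return [(x, y)
            for y in range(ey - lsr, ey + lsr + 1)
            for x in range(ex - lsr, ex + lsr + 1)
            if 0 <= x < ref_w and 0 <= y < ref_h]


coarse_identity = [[(x, y) for x in range(2)] for y in range(2)]
fine = upscale_map(coarse_identity)
print(fine[3][2], len(candidates(fine[2][2], 1, 4, 4)))
```

If the coarse match is correct to the nearest coarse pixel, the true fine-level match lies within about one pixel of the upscaled estimate, so at most nine candidates need to be tried per pixel.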

set ST=st 0.01

set PART=search none part on

%IMDEV%convert ^
  -seed 1234 ^
  pm_ph1.tiff ^
  pm_ph2.tiff ^
  -filter cubic ^
  ( -clone 0-1 ^
    -resize "75x50^!" ^
    -process 'pixmatch search none part off sweep on v' ^
  ) ^
  ( -clone 0-2 ^
    -resize "150x100^!" ^
    -process 'pixmatch lsr 1 %ST% %PART% mpi 40 v' ^
  ) ^
  -delete 2 ^
  ( -clone 0-2 ^
    -resize "300x200^!" ^
    -process 'pixmatch lsr 1 %ST% %PART% mpi 20 v' ^
  ) ^
  -delete 2 ^
  ( -clone 0-2 ^
    -resize "600x400^!" ^
    -process 'pixmatch lsr 1 %ST% %PART% search entire v' ^
  ) ^
  -delete 2 ^
  -delete 0-1 ^
  pm_phm2.png

call StopWatch 
0 00:00:52
%IM%convert ^
  pm_phm2.png ^
  -separate ^
  pm_phm2s.png

%IM%convert ^
  pm_ph1.tiff ^
  pm_phm2.png ^
  -compose Distort -set option:compose:args 100%%x100%% -composite ^
  pm_phd2.tiff
[image: pm_phm2.png]
[images: pm_phm2s-0.png, pm_phm2s-1.png, pm_phm2s-2.png]
[image: pm_phd2.tiff]

The directional streakiness of the map has gone. In the blue channel, the lightest parts are not as light.

Visually, the displaced image is greatly improved. The arms of the bench are now smooth, without obvious flaws. The image is less noisy, though noise is still obvious on the top edge of the bench.

%IM%convert ^
  pm_phm2.png ^
  -format "MEAN=%%[fx:mean.b]\nMAXIMA=%%[fx:maxima.b]\n" ^
  info: 
MEAN=0.0221053
MAXIMA=0.159609
%IM%compare -metric RMSE pm_ph2.tiff pm_phd2.tiff NULL:  
2801.86 (0.0427536)

The blue channel maximum has roughly halved. Hence, the worst pixels in the displacement are not as bad. But processing time has increased more than four-fold. Are the improvements worth the cost? That depends on the application.

From one call of pixmatch to the next, we are moving from low-frequency detail to a higher level. We are moving down a level of a Gaussian pyramid, from coarse detail to finer detail. Within each invocation of pixmatch, scores always improve. But scores will generally become worse from one call to the next. We have introduced detail that wasn't present at the coarser level. At one level all the pixels may be perfect, and at the next they may all be terrible. How much effort should we make at one level seeking perfection? How do we know when it is good enough?

Starting at a coarser level (37x25 pixels) makes no difference to speed or quality.

At the 75x50 level, reducing mpi from 100 to 10 shortens the processing from 30 to 26 seconds, and makes no difference to the MEAN nor IM's RMSE. However, it worsens the MAXIMA score from 0.18 to 0.22.

At the next level, 150x100, increasing mpi from the default 10 to 50 just takes a little longer without affecting quality.

Script pixMatch.bat

The convert command for multi-scale matching of large images becomes complex, especially as reference and source may be different sizes. So, based on the above and many other trials, here is a meta-script pixMatch.bat. It creates and runs a script with the convert command that reads reference and source, and creates the displacement map. It starts at the coarsest scale (the top of the Gaussian pyramid) where the minimum dimension is 50 pixels. It does a full sweep at the coarsest level, then PART phases for intermediate levels, and finishes with both search and PART phases.

For example:

:skip
set pmSEED=-seed 1234

call %PICTBAT%pixMatch ^
  pm_ph1.tiff ^
  pm_ph2.tiff ^
  pm_scr1.png

call StopWatch 
0 00:01:10
%IM%convert ^
  pm_scr1.png ^
  -separate ^
  pm_scr1s.png

%IM%convert ^
  pm_ph1.tiff ^
  pm_scr1.png ^
  -compose Distort -set option:compose:args 100%%x100%% -composite ^
  pm_scr1d.tiff
[image: pm_scr1.png]
[images: pm_scr1s-0.png, pm_scr1s-1.png, pm_scr1s-2.png]
[image: pm_scr1d.tiff]
%IM%convert ^
  pm_scr1.png ^
  -format "MEAN=%%[fx:mean.b]\nMAXIMA=%%[fx:maxima.b]\n" ^
  info: 
MEAN=0.0341334
MAXIMA=0.257267
%IM%compare -metric RMSE pm_ph2.tiff pm_scr1d.tiff NULL:  
2518.17 (0.0384248)

Here is the script that was created and run:

( f:\prose\PICTURES\pm_ph1.tiff +write mpr:REF )
( pm_ph2.tiff +write mpr:SRC )
-filter cubic -seed 1234
( -filter cubic
( mpr:REF -resize 75x50! )
( mpr:SRC -resize 75x50! )
-process 'pixmatch search none part off sweep on '
+write mpr:MAP3
)
( -filter cubic
( mpr:REF -resize 150x100! )
( mpr:SRC -resize 150x100! )
( mpr:MAP3 -resize 150x100! )
-process 'pixmatch lsr 1 st 0.01 search none part on mpi 90 '
+write mpr:MAP2
)
-delete 2
( -filter cubic
( mpr:REF -resize 300x200! )
( mpr:SRC -resize 300x200! )
( mpr:MAP2 -resize 300x200! )
-process 'pixmatch lsr 1 st 0.01 search none part on mpi 30 '
+write mpr:MAP1
)
-delete 2
( -filter cubic
( mpr:REF )
( mpr:SRC )
( mpr:MAP1 -resize 600x400! )
-process 'pixmatch lsr 1 st 0.01 search entire part on mpi 10 '
+write mpr:MAP0
)
-delete 2
-delete 0-1
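
The level sizes and mpi values in that generated script follow from the loop in pixMatch.bat (listed under Scripts below). As a cross-check, here is a Python sketch of the same arithmetic, using the script's default settings; pyramid_plan is a hypothetical name and the sketch only reports source sizes and options, it does not run ImageMagick.

```python
def pyramid_plan(ref_size, src_size, min_dim_initial=50, dim_fact=2,
                 mpi_final=10, mpi_fact=3, lsr=1, st="0.01"):
    """Return [(source size, pixmatch options)] from coarsest to finest level."""
    ref, src, mpi = [ref_size], [src_size], [mpi_final]
    min_dim = min(ref_size + src_size)
    # Do-while loop, as in the BAT script: compute a level, then test.
    while True:
        ref.append((ref[-1][0] // dim_fact, ref[-1][1] // dim_fact))
        src.append((src[-1][0] // dim_fact, src[-1][1] // dim_fact))
        mpi.append(mpi[-1] * mpi_fact)
        min_dim //= dim_fact
        if min_dim < min_dim_initial:
            break
    coarsest = len(src) - 2  # the last level computed is too small and is not used
    plan = []
    for i in range(coarsest, -1, -1):
        if i == coarsest:
            opts = "search none part off sweep on"
        elif i == 0:
            opts = f"lsr {lsr} st {st} search entire part on mpi {mpi[i]}"
        else:
            opts = f"lsr {lsr} st {st} search none part on mpi {mpi[i]}"
        plan.append((src[i], opts))
    return plan


for size, opts in pyramid_plan((600, 400), (600, 400)):
    print(size, opts)
```

For two 600x400 inputs this gives levels of 75x50, 150x100, 300x200 and 600x400, with mpi 90, 30 and 10 on the last three.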

Large photographs

We can apply the above techniques to large photographs.

%IM%identify %LGE_BENCH_1% 
%IM%identify %LGE_BENCH_2% 
F:\pictures\20150622\AGA_2378_sRGB.tiff TIFF 7378x4924 7378x4924+0+0 16-bit sRGB 190.2MB 0.000u 0:00.000
F:\pictures\20150622\AGA_2379_sRGB.tiff TIFF 7378x4924 7378x4924+0+0 16-bit sRGB 189.2MB 0.016u 0:00.015

Above, we prepared a map at a small scale of these images, so we can resize that and use it as the input to another stage of pixmatch, at full scale.

%IMDEV%convert ^
  %LGE_BENCH_1% ^
  %LGE_BENCH_2% ^
  ( pm_phm2.png -resize "7378x4924^!" ) ^
  -process 'pixmatch lsr 4 mpi 5 v' ^
  -compress Zip ^
  %LGE_MAP_DIR%pm_bench_lge_map.tiff

The above command does 25 billion comparisons in 2 hours 20 minutes, and is not executed each time this page is rebuilt. Some other choice of stages and options may reduce the comparisons and time. (Yes. The script takes 1.5 hours.)

For the web, we show a small resize of this large displacement map, and three full-size crops.

set CRP1=-crop 500x300+1223+1090 +repage
set CRP2=-crop 500x300+5322+1920 +repage
set CRP3=-crop 500x300+6500+3515 +repage

%IM%convert ^
  %LGE_MAP_DIR%pm_bench_lge_map.tiff ^
  ( +clone -resize "600x400^!" ^
    +write pm_bench_lge_map_sm.png ^
    +delete ) ^
  ( +clone %CRP1% ^
    +write pm_bench_lge_map_crp1.png ^
    +delete ) ^
  ( +clone %CRP2% ^
    +write pm_bench_lge_map_crp2.png ^
    +delete ) ^
  ( +clone %CRP3% ^
    +write pm_bench_lge_map_crp3.png ^
    +delete ) ^
  NULL:
[image: pm_bench_lge_map_sm.png]
[images: pm_bench_lge_map_crp1.png, pm_bench_lge_map_crp2.png, pm_bench_lge_map_crp3.png]

We then use this map to displace LGE_BENCH_1 into an imitation of LGE_BENCH_2.

%IM%convert ^
  %LGE_BENCH_1% ^
  %LGE_MAP_DIR%pm_bench_lge_map.tiff ^
  -compose Distort -set option:compose:args 100%%x100%% -composite ^
  pm_bench_lge_mock2.tiff

How close is the imitation to the real thing?

%IM%compare -metric RMSE pm_bench_lge_mock2.tiff %LGE_BENCH_2% NULL:  
1161.15 (0.0177181)

Not great, but not bad.

We show a reduced-size version, and three crops from the same location as the map.

%IM%convert ^
  pm_bench_lge_mock2.tiff ^
  ( +clone -resize "600x400^!" ^
    +write pm_bench_lge_mock2_sm.png ^
    +delete ) ^
  ( +clone %CRP1% ^
    +write pm_bench_lge_mock2_crp1.png ^
    +delete ) ^
  ( +clone %CRP2% ^
    +write pm_bench_lge_mock2_crp2.png ^
    +delete ) ^
  ( +clone %CRP3% ^
    +write pm_bench_lge_mock2_crp3.png ^
    +delete ) ^
  NULL:
[image: pm_bench_lge_mock2_sm.png]
[images: pm_bench_lge_mock2_crp1.png, pm_bench_lge_mock2_crp2.png, pm_bench_lge_mock2_crp3.png]

For comparison here are crops from the same locations in LGE_BENCH_2.

%IM%convert ^
  %LGE_BENCH_2% ^
  ( +clone -resize "600x400^!" ^
    +write pm_bench_lge_real2_sm.png ^
    +delete ) ^
  ( +clone %CRP1% ^
    +write pm_bench_lge_real2_crp1.png ^
    +delete ) ^
  ( +clone %CRP2% ^
    +write pm_bench_lge_real2_crp2.png ^
    +delete ) ^
  ( +clone %CRP3% ^
    +write pm_bench_lge_real2_crp3.png ^
    +delete ) ^
  NULL:
[image: pm_bench_lge_real2_sm.png]
[images: pm_bench_lge_real2_crp1.png, pm_bench_lge_real2_crp2.png, pm_bench_lge_real2_crp3.png]

To more clearly see the imperfections of the imitation, we scale up the plaque at the back of the bench from both the imitation and the real LGE_BENCH_2.

%IM%convert ^
  pm_bench_lge_mock2.tiff ^
  -crop 100x37+3155+993 +repage -scale 800%% ^
  pm_plaq_mock2.png
[image: pm_plaq_mock2.png]
%IM%convert ^
  %LGE_BENCH_2% ^
  -crop 100x37+3155+993 +repage -scale 800%% ^
  pm_plaq_real2.png
[image: pm_plaq_real2.png]

From pixels to patches

We use the small versions of the bench image.

 %IMDEV%convert ^
  -seed 1234 ^
  pm_ph1.tiff ^
  pm_ph2.tiff ^
  pm_phm2.png ^
  -process 'pixmatch st 0.1 search none wr 5 v' ^
  pm_pt_map.png >pm_pt_map.lis 2^>^&1

call StopWatch 
0 00:01:17
%IM%convert ^
  pm_pt_map.png ^
  -separate ^
  pm_pt_maps.png

%IM%convert ^
  pm_ph1.tiff ^
  pm_pt_map.png ^
  -compose Distort -set option:compose:args 100%%x100%% -composite ^
  pm_pt1.tiff
pixmatch options:
  window_radius 5  limit_search_radius 0  similarity_threshold 0.1  hom_chk 0.1
  search none
  propagate_and_random_trial on  max_part_iter 10  tanc on
  sweep off
  displacement_type absolute  verbose
chklist (pixmatchImage) chkL A   List length 3  chklist OK
---------------------------------------
pixmatch: ref 600x400  src 600x400
ReadPrevResult
CalcScores: nCompares=240000
OnePart: nChanges=91783 nToMatch=37862
OnePart: nChanges=10145 nToMatch=36646
OnePart: nChanges=3816 nToMatch=36196
OnePart: nChanges=2568 nToMatch=35861
OnePart: nChanges=1945 nToMatch=35633
OnePart: nChanges=1469 nToMatch=35448
OnePart: nChanges=1227 nToMatch=35307
OnePart: nChanges=1079 nToMatch=35188
OnePart: nChanges=859 nToMatch=35097
OnePart: nChanges=800 nToMatch=35035
IterParts: nIter=10
IterParts: nCompares=28048535
nTotalCompares=28288535
chkentry (MkDispImage) mdi   List length 1  chkentry OK
chkentry (MkDispImage) mdi2   List length 1  chkentry OK
chkentry (pixmatch) be   List length 1  chkentry OK
chklist (pixmatchImage) after rem   List length 1  chklist OK
chklist (pixmatchImage) after rem list   List length 1  chklist OK

Here is the displacement map, pm_pt_map.png:

[image: pm_pt_map.png]

Here are the separated colour channels of the displacement map:

[images: pm_pt_maps-0.png, pm_pt_maps-1.png, pm_pt_maps-2.png]

Here is the result from applying the displacement map to the first photo:

[image: pm_pt1.tiff]

The result is similar to the second photo, as required.

We should trial the resize filters to see which one gives the result closest to the final map.

Dissimilar images

In the examples above, reference and source have been similar images, such as two photographs of the same bench. They can be dissimilar, such as a photograph of gravel and the Mona Lisa.

IM's -compose Distort ... -composite makes an output the same size as the first input, which will be the gravel. We need it to be the same size as the Mona Lisa.

One solution is to extend both images to be the same size. (This is a kludge. A better solution would be for IM to have the option of taking metadata from the second input.) We can extend either before pixmatch, which will then waste time processing the extensions, or after pixmatch, which means fiddling with the result.

Application: morphing

How do we morph? If we simply blend the map with an identity map, we pull pixels from the wrong parts of the image. We want the converse, some kind of blended push map.

-process invDispMap should do the trick. Invert the map, fill holes, blend it as required with the identity map, then invert again and fill holes. If we use the "wrong" map, there is no need for the first inversion and filling of holes.

The inverted displacement map has too many holes. Sure, they can be filled, but the result is horrible.

We can generate both maps which are, in theory, inverses of each other.

B=disp(A,mapAB)
A=disp(B,mapBA)
mapBA=inv(mapAB)  but the process leaves holes in mapBA

C=disp(A,mapAC)

We don't know mapAC. We have calculated its inverse, mapCA, but the inverse of mapCA has holes.
We can also calculate mapCB.

identity=disp(mapAB,mapBA)

In general:
mapXZ = disp(mapXY,mapYZ)

hence:
mapCB=disp(mapCA,mapAB)  we know all three.


mapAC=disp(mapAB,mapBC)  but we don't know mapBC.
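
These relationships can be checked with a one-dimensional Python sketch. It assumes absolute maps held as lists of indexes; disp and invert are hypothetical helpers, and invert is a naive push that leaves holes as None rather than the invDispMap module.

```python
def disp(values, mapping):
    """Pull: result[p] = values[mapping[p]]."""
    return [values[q] for q in mapping]


def invert(mapping, size):
    """Invert by pushing. Positions that nothing maps to stay None (holes)."""
    inv = [None] * size
    for p, q in enumerate(mapping):
        inv[q] = p
    return inv


A = ["a", "b", "c", "d"]

# A one-to-one map has a clean inverse.
mapAB = [2, 0, 3, 1]
B = disp(A, mapAB)
mapBA = invert(mapAB, len(A))
print(disp(B, mapBA))        # recovers A
print(disp(mapAB, mapBA))    # the identity map

# Composition: mapXZ = disp(mapXY, mapYZ).
mapBC = [3, 3, 0, 1]
mapAC = disp(mapAB, mapBC)
print(disp(A, mapAC) == disp(B, mapBC))

# A many-to-one map leaves holes when inverted.
print(invert([0, 0, 1, 1], 4))
```

Maps found by pixmatch are generally many-to-one, so holes in the inverse are to be expected.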

Application: lassoing

Application: image analogies

One image in the style of another. Make a sketch from the first bench photo. Using the map, displace that into a sketch of the second photo.

Texture transfer: Take a photo of a banana and another of an orange, in the same lighting conditions. Create an image that has the shape of the banana but colour and texture of the orange, and vice versa.

Perhaps more difficult: transfer just the texture or just the colour, but not both.

Application: simple animation

Video some action. Hand-draw a version of the first frame. By image analogies, make versions of all the other frames.

Open questions

In theory, better results may be obtained when the two input images to pixmatch are in L*a*b* colorspace, not sRGB. How much difference does this make?

If using Lab, this is the general technique:

convert ^
  ref.png ^
  src.png ^
  -colorspace Lab ^
  -process pixmatch ^
  -set colorspace sRGB ^
  map.png

Converting the inputs to Lab should find matches that a human would agree with. ImageMagick sees that the first input to pixmatch is Lab, so it assumes the output is also Lab. When saving to a PNG file, it would convert this to sRGB, so we lie to IM and pretend the image is already sRGB. Then IM does no conversion.

When using multi-scale stages, the window radius should probably be different at the different scales.

What is the optimum skip number that minimises the number of searches?

Future

Currently, for comparison purposes, all pixels in the window are given the same weights. I may add the option to weight them in a Gaussian fashion.

For the auto_correct or sweep, code might find the worst match and try to find a better one, and keep going in decreasing order of badness until none change.

The module can be used to find patches instead of pixels: scale the result down, then back up. This discards all results except for the central pixel of each window. It wastes most of the processing that has been done. The module could be modified to avoid doing the processing, so source windows were non-overlapping: all those for loops that walk through source pixels would skip over the window width, and the result from each central pixel copied to all other pixels in the window.

The displacement map may refer many source pixels to one reference pixel. For some purposes, it would be useful to restrict each reference to a single source. But then we have the problem of how to globally optimise this.

We might prioritise the source pixels.

It might be interesting to derive a smooth displacement map, so that output pixels in close proximity are generally derived from input pixels also in close proximity. This would stretch features of the bench and grass, with displacement discontinuities at object boundaries. Applying a median filter to the map might be interesting.

The matching could be invariant to brightness and contrast. Before searching, calculate the mean and SD of the subimage. For each candidate window, calculate its mean and SD, thus the required gain and bias to match with the subimage. Apply this gain and bias to each pixel when calculating differences. See Gain and bias. Put this as a compile-time option in match.inc and compwind.inc.
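
A minimal Python sketch of that idea, assuming windows are flattened to lists of single-channel values. The names mean_sd and gain_bias_rmse are hypothetical, and this is not code from match.inc or compwind.inc.

```python
import math


def mean_sd(values):
    """Mean and (population) standard deviation of a list of values."""
    m = sum(values) / len(values)
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def gain_bias_rmse(sub, cand):
    """RMSE after applying a gain and bias to cand so its mean and SD match sub."""
    mean_s, sd_s = mean_sd(sub)
    mean_c, sd_c = mean_sd(cand)
    gain = sd_s / sd_c if sd_c > 0 else 1.0  # flat candidate: bias only
    bias = mean_s - gain * mean_c
    return math.sqrt(
        sum((s - (gain * c + bias)) ** 2 for s, c in zip(sub, cand)) / len(sub))


sub = [0.2, 0.4, 0.6]
darker = [0.1, 0.2, 0.3]      # same pattern at half the brightness
reversed_ = [0.3, 0.2, 0.1]   # a different pattern with the same statistics

print(gain_bias_rmse(sub, darker))     # close to zero
print(gain_bias_rmse(sub, reversed_))  # clearly non-zero
```

The darker window then matches almost perfectly, while a window with the same mean and SD but a different pattern still scores badly.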

Scripts

For convenience, .bat scripts are also available in a single zip file. See Zipped BAT files.

pixMatch.bat

rem From images %1 and %2,
rem creates a displacement map, optional name %3.
@rem
@rem Also uses:
@rem   pmSEED eg "-seed 1234". If not set, PART phase will give different results each time.
@rem   pmVERIFY set to v or whatever pixmatch flags are desired.
@rem   pmMIN_DIM_INITIAL Don't make coarsest level have smaller dimension than this.
@rem   pmDIM_FACT size factor between levels.
@rem   pmMPI_FINAL mpi setting for final PART phase.
@rem   pmMPI_FACT multiplier for mpi setting for previous phases.
@rem
@rem This creates a script, and runs it with the IM v6 "@" mechanism.
@rem It could be modified for v7 by moving the output file from the script to the command.


@if "%2"=="" findstr /B "rem @rem" %~f0 & exit /B 1

@setlocal

@call echoOffSave

call %PICTBAT%setInOut %1 pm


set REF=%INFILE%
set SRC=%2

if not "%3"=="" set OUTFILE=%3

if "%pmLSR%"=="" set pmLSR=1
if "%pmST%"=="" set pmST=0.01

rem Currently, the following must be integers.
if "%pmMIN_DIM_INITIAL%"=="" set pmMIN_DIM_INITIAL=50
if "%pmDIM_FACT%"=="" set pmDIM_FACT=2
if "%pmMPI_FINAL%"=="" set pmMPI_FINAL=10
if "%pmMPI_FACT%"=="" set pmMPI_FACT=3


set SCR_FILE=%BASENAME%_%sioCODE%.scr

set REF_WW=
for /F "usebackq" %%L in (`%IM%identify ^
  -format "REF_WW=%%w\nREF_HH=%%h\n" ^
  %REF%`) do set %%L
if "%REF_WW%"=="" exit /B 1

set SRC_WW=
for /F "usebackq" %%L in (`%IM%identify ^
  -format "SRC_WW=%%w\nSRC_HH=%%h\n" ^
  %SRC%`) do set %%L
if "%SRC_WW%"=="" exit /B 1

set MIN_WW=%REF_WW%
set MIN_HH=%REF_HH%
if %MIN_WW% GTR %SRC_WW% set MIN_WW=%SRC_WW%
if %MIN_HH% GTR %SRC_HH% set MIN_HH=%SRC_HH%

set MIN_DIM=%MIN_WW%
if %MIN_DIM% GTR %MIN_HH% set MIN_DIM=%MIN_HH%

echo MIN_WW=%MIN_WW% MIN_HH=%MIN_HH% MIN_DIM=%MIN_DIM%

set REF_WW_0=%REF_WW%
set REF_HH_0=%REF_HH%
set SRC_WW_0=%SRC_WW%
set SRC_HH_0=%SRC_HH%
set CNT=0
set Cm1=0
set nMPI_0=%pmMPI_FINAL%

:loop

  set /A CNT+=1
  set /A REF_WW_%CNT%=!REF_WW_%Cm1%!/%pmDIM_FACT%
  set /A REF_HH_%CNT%=!REF_HH_%Cm1%!/%pmDIM_FACT%
  set /A SRC_WW_%CNT%=!SRC_WW_%Cm1%!/%pmDIM_FACT%
  set /A SRC_HH_%CNT%=!SRC_HH_%Cm1%!/%pmDIM_FACT%
  set /A nMPI_%CNT%=!nMPI_%Cm1%!*%pmMPI_FACT%
  set /A Cm1+=1
  set /A MIN_DIM/=%pmDIM_FACT%

if %MIN_DIM% GEQ %pmMIN_DIM_INITIAL% goto loop

set /A Cm1=%CNT%-1

echo CNT=%CNT% Cm1=%Cm1%

(
  echo ^( %REF% +write mpr:REF ^)
  echo ^( %SRC% +write mpr:SRC ^)
  echo -filter cubic %pmSEED%

  for /L %%I in (%Cm1%,-1,0) do (
    set N=2
    if %%I==%Cm1% set N=1

    echo ^( -filter cubic

    if %%I==0 (
      echo ^( mpr:REF ^)
      echo ^( mpr:SRC ^)
    ) else (
      echo ^( mpr:REF -resize !REF_WW_%%I!x!REF_HH_%%I!^^! ^)
      echo ^( mpr:SRC -resize !SRC_WW_%%I!x!SRC_HH_%%I!^^! ^)
    )

    set /A Ip1=%%I+1

    if %%I LSS %Cm1% echo ^( mpr:MAP!Ip1! -resize !SRC_WW_%%I!x!SRC_HH_%%I!^^! ^)

    if %%I==%Cm1% (
      set pmOPT=search none part off sweep on %pmVERIFY%
    ) else if %%I==0 (
      set pmOPT=lsr %pmLSR% st %pmST% search entire part on mpi !nMPI_%%I! %pmVERIFY%
    ) else (
      set pmOPT=lsr %pmLSR% st %pmST% search none part on mpi !nMPI_%%I! %pmVERIFY%
    )

    echo -process 'pixmatch !pmOPT!'
    echo +write mpr:MAP%%I
    echo ^)

    if %%I LSS %Cm1% echo -delete 2
  )

  echo -delete 0-1
  rem echo %OUTFILE%
) >%SCR_FILE%

set SRC_
set REF_
set nMPI_

echo SCR_FILE=%SCR_FILE%
type %SCR_FILE%

%IMDEV%convert -filter cubic %pmSEED% @%SCR_FILE% %OUTFILE%

call echoRestore

endlocal & set pmOUTFILE=%OUTFILE%

All images on this page were created by the commands shown, using:

%IM%identify -version
Version: ImageMagick 6.9.5-3 Q16 x86 2016-07-22 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP 
Delegates (built-in): bzlib cairo flif freetype jng jp2 jpeg lcms lqr openexr pangocairo png ps rsvg tiff webp xml zlib

Source file for this web page is pixmatch.h1. To re-create this web page, execute "procH1 pixmatch".


This page, including the images, is my copyright. Anyone is permitted to use or adapt any of the code, scripts or images for any purpose, including commercial use.

Anyone is permitted to re-publish this page, but only for non-commercial use.

Anyone is permitted to link to this page, including for commercial use.


Page version v1.0 11-Dec-2015.

Page created 29-Dec-2017 01:53:20.

Copyright © 2017 Alan Gibson.