snibgo's ImageMagick pages

K-clustering

A process module for k-means, fuzzy k-means and k-harmonic, with transparency.

These are colour-reduction methods, like IM's -colors, but different.

When an image has been clustered into a small collection of colours, each pixel is set to the closest colour. The colour used for a cluster may not be a good representative of the image pixels in the cluster, so the process adjusts the colour of each cluster to be more representative, then re-clusters the image to these new pixels, and iterates until colours no longer change, within a specified tolerance.

On this page, "distance" always means the difference between two colours, as measured by three colour channels. It is not concerned with positions of pixels on the image.

References

This page and the process module implements three of the clustering methods described in:

However, the paper doesn't describe how opacity should be processed.

Also relevant:

See also references in the section Jump: finding k below.

Sample inputs

set SRC=toes.png
toes.pngjpg
set SRC_HOL=toes_holed.png
toes_holed.pngjpg

The process module

Option Description
Short
form
Long form
m string method string Method for converging clusters, one of:
    Means "k-means" clustering;
    Fuzzy "fuzzy k-means" clustering;
    Harmonic "k-harmonic" clustering;
    Null no convergence;
Default = means.
k N
k N-M
kCols N
kCols N-M
Number of clusters to find.
Parameter is one integer, or two integers separated by a hyphen.
When two integers, finds k as the best value between N and M inclusive by the jump method.
For example: kCols 10, kCols 5-15.
Default = 10.
j N,M jump N,M DEPRECATED. Find k as the best value between N and M inclusive by the jump method.
For example: jump 1,10
s sdNorm Transform input to equalise channel SDs, and invert this at the end.
a regardAlpha Process opacity.
i string initialize string Initialization for clusters, one of:
    Colors IM's "-colors";
    Forgy chooses random pixels from the image;
    RandPart assigns image pixels to random partitions;
    SdLine uses colours from a short line in the colour cube;
    FromList uses unique colours from last image in list;
Default = colors.
y N jumpY N Y power parameter for "jump" iterations. A positive number.
Default = 0.5, 1.0 or 1.5.
r N fuzzyR N Power parameter r for fuzzy method. A positive number at least 1.0.
Default = 1.5.
p N harmonicP N Power parameter p for harmonic method. A positive number at least 2.0.
Default = 3.5.
t N tolerance N Stop iterating when worst channel change is less than or equal to this.
Default = 0.01.
x N maxIter N Stop after this many iterations.
Default = 100.
w filename write filename Write iterations as image files.
For example: write myfile_%06d.png
string file string Write verbose text to stderr or stdout.
Default: stderr.
v verbose Write some text output to stderr.
v2 verbose2 Write more text output to stderr.
d debug Write debugging text output to stderr.

(Case is not significant.)

The module initializes to a set of kCols colours, and uses a method to iteratively improve the colours.

Instead of specifying an exact number of colours to find, we can find the number that gives the biggest jump in quality. The module will choose the best number for this image within that range.

sdNorm is an option to normalise the standard deviation of channels, so each dimension has the same influence.

Iteration continues until a tolerance is achieved, or when maxIter iterations have been performed. tolerance can be set to zero but arithmetical rounding may prevent a perfect result from being achieved.

Examples

The number shown beneath each image is the RMSE distortion from the input image, on a scale of 0.0 to 1.0, where 0.0 means "no change".

Default parameters:

%IMDEV%convert ^
  %SRC% ^
  -process kcluster ^
  kc_ex1.png
0.0383549
kc_ex1.pngjpg

Default method ("means"), varying kCols, with "-colors" shown for comparison:

-process 'kcluster' -colors
%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster kCols 3' ^
  kc_ex2.png
0.0725123
kc_ex2.pngjpg
%IMDEV%convert ^
  %SRC% ^
  +dither ^
  -colors 3 ^
  kc_exc2.png
0.0780149
kc_exc2.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster kCols 4' ^
  kc_ex4.png
0.0575264
kc_ex4.pngjpg
%IMDEV%convert ^
  %SRC% ^
  +dither ^
  -colors 4 ^
  kc_exc4.png
0.0673186
kc_exc4.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster kCols 5' ^
  kc_ex5.png
0.0493609
kc_ex5.pngjpg
%IMDEV%convert ^
  %SRC% ^
  +dither ^
  -colors 5 ^
  kc_exc5.png
0.0656154
kc_exc5.pngjpg

For the same number of colours, the process module with default settings gives a better score than "-colors".

The other two methods: fuzzy and harmonic.

%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster method Fuzzy' ^
  kc_fz1.png
0.0391296
kc_fz1.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster method Harmonic' ^
  kc_hm1.png
0.0382404
kc_hm1.pngjpg

Normalise SD

Not normalised Normalised
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Means k 5' ^
  kc_km_nsd0.png
0.0493609
kc_km_nsd0.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Means sdNorm k 5' ^
  kc_km_nsd1.png
0.0514201
kc_hm_nsd1.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy k 5' ^
  kc_fm_nsd0.png
0.0496455
kc_fm_nsd0.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy sdNorm k 5' ^
  kc_fm_nsd1.png
0.0516036
kc_fm_nsd1.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic k 5' ^
  kc_hm_nsd0.png
0.0497404
kc_hm_nsd0.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic sdNorm k 5' ^
  kc_hm_nsd1.png
0.0509971
kc_hm_nsd1.pngjpg

The normalised versions show more detail in the skin tones, and less in the grass. Numerically, the normalised versions are slightly less accurate.

The normalisation process implemented here is crude: it calculates the standard deviations of the three colour channels, and divides values in two channels in the appropiate ratio. (With appropriate processing when one, two or three channels have SD=0.)

A more sophisticated method may be more useful.

Initialization

The methods all start from an initial guess at the colours, and refine this by iteration. But they iterate towards a local optimum, not a global optimum, so the result will depend on the first guess.

The default "initialize Colors" uses the equivalent of the CLI -colors k. This isn't guaranteed to find as many colours as requested. When it doesn't, the module will try to fill the slots with random pixels from the image.

"initialize SdLine" uses k colours that lie on a straight line within the colour cube. The coordinates of the ends of this line are (mnR-sdR,mnG-sdG,mnB-sdB) and (mnR+sdR,mnG+sdG,mnB+sdB), where mnR is the mean of the red channel, sdR is the standard deviation of the red channel, and so on. Mean and SD calculations always take alpha into account, eg transparent pixels are ignored. The line is divided into k segments, and the mid-points of those segments are used for the k colours.

"initialize FromList" uses the unique colours from the last image in the current list. The number of unique colours gives k, so kCols has no relevance, and is ignored. Hence the list must contain at least two images. All the images in the list except the last will be clustered.

"initialize Forgy" picks the k colours from k pixels chosen at random from the image, so we get a different initialisation each time (unless we use the same seed).

"initialize RandPart" is the "random partition" method. This assigns each image pixel to a random cluster, and calculates the cluster colour as the mean of its pixels, weighted by alpha. This initialises the clusters to colours that are close to the image mean. Again, we get a different initialisation each time (unless we use the same seed).

Examples of "initialize Colors":

%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Means i Colors k 5' ^
  kc_km_col.png
0.0493609
kc_km_col.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy i Colors k 5' ^
  kc_fz_col.png
0.0496455
kc_fz_col.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic i Colors k 5' ^
  kc_hm_col.png
0.0497404
kc_hm_col.pngjpg

Examples of "initialize SdLine":

%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Means i SdLine k 5' ^
  kc_km_sdl.png
0.0492722
kc_km_sdl.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy i SdLine k 5' ^
  kc_fz_sdl.png
0.0497469
kc_fz_sdl.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic i SdLine k 5' ^
  kc_hm_sdl.png
0.0502853
kc_hm_sdl.pngjpg

Examples of "initialize Forgy":

for /L %%I in (0,1,3) do %IMDEV%convert ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Means initialize Forgy kCols 5' ^
  kc_km_init_%%I.png
0.0493787
0.0515461
0.0493491
0.051236

The four versions have major differences.

kc_km_init_0.pngjpg kc_km_init_1.pngjpg kc_km_init_2.pngjpg kc_km_init_3.pngjpg
for /L %%I in (0,1,3) do %IMDEV%convert ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Fuzzy initialize Forgy kCols 5' ^
  kc_fm_init_%%I.png
0.0497924
0.0499776
0.0497327
0.050086

The four versions have small differences.

kc_fm_init_0.pngjpg kc_fm_init_1.pngjpg kc_fm_init_2.pngjpg kc_fm_init_3.pngjpg
for /L %%I in (0,1,3) do %IMDEV%convert ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Harmonic initialize Forgy kCols 5' ^
  kc_hm_init_%%I.png
0.0504368
0.0496624
0.0498175
0.0501205

The four versions have small differences.

kc_hm_init_0.pngjpg kc_hm_init_1.pngjpg kc_hm_init_2.pngjpg kc_hm_init_3.pngjpg

The default "initialise Colors" used above for the three methods gives a slightly better score than any of the "Forgy" results shown here. The "initialise Colors" needs far more CPU cycles to obtain.

Examples of "initialize RandPart":

for /L %%I in (0,1,3) do %IMDEV%convert ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Means initialize RandPart kCols 5' ^
  kc_km_initrp_%%I.png
0.0492879
0.0492781
0.0493798
0.0493491

The four versions have very small differences.

kc_km_initrp_0.pngjpg kc_km_initrp_1.pngjpg kc_km_initrp_2.pngjpg kc_km_initrp_3.pngjpg
for /L %%I in (0,1,3) do %IMDEV%convert ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Fuzzy initialize RandPart kCols 5' ^
  kc_fm_initrp_%%I.png
0.0497141
0.0496468
0.0497909
0.0496004

The four versions have small differences.

kc_fm_initrp_0.pngjpg kc_fm_initrp_1.pngjpg kc_fm_initrp_2.pngjpg kc_fm_initrp_3.pngjpg
for /L %%I in (0,1,3) do %IMDEV%convert ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Harmonic initialize RandPart kCols 5' ^
  kc_hm_initrp_%%I.png
0.0498024
0.0496748
0.0503784
0.0499867

The four versions have small differences.

kc_hm_initrp_0.pngjpg kc_hm_initrp_1.pngjpg kc_hm_initrp_2.pngjpg kc_hm_initrp_3.pngjpg

"method Means initialize RandPart" gives a consistent result that is slightly better than "-colors 5".

"method Fuzzy initialize RandPart" is fairly consistent.

"method Harmonic initialize RandPart" is less consistent.

"initialize SdLine" gives a good result for "method Means", but not so good for the other methods.

Conclusion: when using "method Means", also specify "initialize RandPart" or "initialize SdLine". (But we shouldn't really draw conclusions from a single example.)

Examples of "initialize FromList":

%IMDEV%convert ^
  %SRC% ^
  ( xc:Red xc:Lime xc:Blue ^
    xc:Cyan xc:Magenta ^
    +append ^
  ) ^
  -process ^
    'kcluster m Means i FromList' ^
  kc_km_lst.png
0.049266
kc_km_lst.pngjpg
%IMDEV%convert ^
  %SRC% ^
  ( xc:Red xc:Lime xc:Blue ^
    xc:Cyan xc:Magenta ^
    +append ^
  ) ^
  -process ^
    'kcluster m Fuzzy i FromList' ^
  kc_fz_lst.png
0.0498286
kc_fz_lst.pngjpg
%IMDEV%convert ^
  %SRC% ^
  ( xc:Red xc:Lime xc:Blue ^
    xc:Cyan xc:Magenta ^
    +append ^
  ) ^
  -process ^
    'kcluster m Harmonic i FromList' ^
  kc_hm_lst.png
0.0497165
kc_hm_lst.pngjpg

More usefully, we can use the result of one clustering as the initial guess for another. See Large images below.

"initialize FromList" enables a variety of other initializations:

%IMDEV%convert ^
  %SRC% ^
  ( +clone ^
    -resize "5x1^!" ^
  ) ^
  -process ^
    'kcluster m Means i FromList' ^
  kc_km_lst2.png
0.0493078
kc_km_lst2.pngjpg

Animation

The write option does some extra processing at the end of each iteration: it applies the clustering to a copy of the image, and saves this to a file. Then, outside the IM command, we can make an animated GIF.

(Logically, the process module might simply make a list of the images in the usual way. But this can cause memory problems.)

del kc_frame_*.png

%IMDEV%convert ^
  %SRC% ^
  -seed 1234 ^
  -process 'kcluster initialize Forgy t 0.001 write kc_frame_%%06d.png v' ^
  kc_frm.png 
kcluster options:   method Means  kCols 10  initialize Forgy  tolerance 0.001  maxIter 100  write kc_frame_%06d.png  verbose
kcluster: Input image [toes.png] 267x233 depth 16; nPixels 62211  k=10
kcluster: 10 uniq_img
kcluster: 40 iterations

This has created files kc_frame_000000.png, kc_frame_000001.png and so on. We make them into a GIF:

%IM%convert ^
  -loop 0 -delay 10 ^
  kc_frame_*.png ^
  -layers optimize ^
  kc_frame.gif
kc_frame.gif

"initialize RandPart" starts with all colours close to the image mean, so this make an effective animation:

del kc_frame_*.png

%IMDEV%convert ^
  %SRC% ^
  -seed 1234 ^
  -process 'kcluster initialize RandPart t 0.001 write kc_frame_%%06d.png' ^
  kc_frm2.png

%IM%convert ^
  -loop 0 -delay 10 ^
  kc_frame_*.png ^
  -layers optimize ^
  kc_frame2.gif
kc_frame2.gif

We don't need to keep the frame files:

del kc_frame_*.png

Jump: finding k

This uses the "jump" method to find the optimum k, the number of colours. See:

Also interesting:

The module creates clusters for some values of k, and calculates the RMSE distortion of each. (The RMSE distortion accounts for alpha, eg transparent pixels are ignored.) The distortion varies from 0.0 when the cluster exactly matches the input, up to 1.0. As k increases the distortion decreases and would eventually reach zero, when k is equal to the number of unique colours.

So minimizing the absolute distortion is not a useful goal. Instead, we measure how much distortion decreases when we increase k by one. The biggest decrease -- the biggest "jump" -- gives the best value for k.

To calculate the jump, we subtract distortions after raising them to a negative power -Y.

If dk is the distortion between the input and the result with k clusters, then:

Jk = dk-Y - d(k-1)-Y

... where d0-Y=0 and Y = p/2, where p is the number of dimensions. For colour images, the colour cube has three axes, so p=3. For grayscale, p=1. However, ordinary colour photographs have correlation between the dimensions, so may have fewer effective dimensions.

The method is very simple and works well. However, it does need the clustering to be done. We can't give it an image and ask: how many clusters is this? We need to feed it the clustered results, and it will choose the best. So this has a performance cost. On the other hand, we can apply the method to any clustering tool, even "-colors" or "-posterize", even with dithering. Cool.

If we don't like the results at the found k, we can search for another, starting at k+1.

Note that the "jump" method is good at finding the most appropriate clustering. If the pixel colours don't naturally fall into a number of groups, it correctly reports "k=1". Here is a clear example of three clusters, which "jump" finds:

Input image Clustered result
%IM%convert ^
  -seed 1234 ^
  -size 100x100 ^
  xc:rgb(10%%,15%%,20%%) ^
  xc:rgb(60%%,50%%,40%%) ^
  xc:rgb(80%%,85%%,90%%) ^
  +append -blur 0x10 ^
  -attenuate 30% ^
  +noise uniform ^
  kc_nse.png
kc_nse.pngjpg
set ARGS=i SdLine f stdout v

%IMDEV%convert ^
  kc_nse.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_nse_out.png |findstr maxJ 
kcluster: maxJ=8.12483  at k=3
kc_nse_out.pngjpg

In the next examples, there is no natural clustering into separate groups:

Input image Clustered result
%IM%convert ^
  -size 100x300 ^
  gradient: ^
  -rotate 90 ^
  kc_nse2.png
kc_nse2.pngjpg
%IMDEV%convert ^
  kc_nse2.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_nse2_out.png |findstr maxJ 
kcluster: maxJ=21.4953  at k=7
kc_nse2_out.pngjpg
%IM%convert ^
  -seed 1234 ^
  -size 300x100 ^
  xc: ^
  +noise random ^
  kc_nse3.png
kc_nse3.pngjpg
%IMDEV%convert ^
  kc_nse3.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_nse3_out.png |findstr maxJ 
kcluster: maxJ=6.45685  at k=1
kc_nse3_out.pngjpg
%IM%convert ^
  %SRC% ^
  -colorspace Gray ^
  kc_gray.png
kc_gray.pngjpg
%IMDEV%convert ^
  kc_gray.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_gray_out.png |findstr maxJ 
kcluster: maxJ=52.7245  at k=4
kc_gray_out.pngjpg

A solution of k=1 may be unacceptable for aesthetic or other reasons. A simple remedy is to start the "jump" search at 2:

Clustered result
%IMDEV%convert ^
  kc_nse2.png ^
  -process ^
    'kcluster k 2-10 %ARGS%' ^
  kc_nse2t_out.png |findstr maxJ 
kcluster: maxJ=21.4953  at k=7
kc_nse2t_out.pngjpg
%IMDEV%convert ^
  kc_nse3.png ^
  -process ^
    'kcluster k 2-10 %ARGS%' ^
  kc_nse3t_out.png |findstr maxJ 
kcluster: maxJ=3.04754  at k=4
kc_nse3t_out.pngjpg
%IMDEV%convert ^
  kc_gray.png ^
  -process ^
    'kcluster k 2-10 %ARGS%' ^
  kc_grayt_out.png |findstr maxJ 
kcluster: maxJ=52.7245  at k=4
kc_grayt_out.pngjpg

Method "null"

This initialises the clusters, then doesn't attempt to improve them. This can answer the question: "What is the best number for -colors?"

%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster k 1-10 i colors m Null f stdout v' ^
  kc_mnull.png |findstr maxJ 
kcluster: maxJ=32.6449  at k=6
kc_mnull.pngjpg

We can also use the "null" method to prepare an initial list of colours for some other purpose:

%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster k 5 i Colors m Null' ^
  -unique-colors ^
  kc_mnull2.png

call %PICTBAT%blockPix kc_mnull2.png
kc_mnull2_bp.png
%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster k 5 i SdLine m Null' ^
  -unique-colors ^
  kc_mnull3.png

call %PICTBAT%blockPix kc_mnull3.png
kc_mnull3_bp.png
%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster k 5 i Forgy m Null' ^
  -unique-colors ^
  kc_mnull4.png

call %PICTBAT%blockPix kc_mnull4.png
kc_mnull4_bp.png
%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster k 5 i RandPart m Null' ^
  -unique-colors ^
  kc_mnull5.png

call %PICTBAT%blockPix kc_mnull5.png
kc_mnull5_bp.png

Although the "null" method doesn't attempt to improve the initial clustering, it does assign each pixel to the appropriate cluster. Hence, some colours may have no pixels assigned to them, so the output may not contain all the input colours. In any case, the order may change.

%IMDEV%convert ^
  %SRC% ^
  ( xc:#012 xc:#123 xc:#234 xc:#345 xc:#456 xc:#567 ^
    +append ^
  ) ^
  -process 'kcluster k 5 i FromList m Null' ^
  -unique-colors ^
  kc_mnull6.png

call %PICTBAT%blockPix kc_mnull6.png
kc_mnull6_bp.png

Dithering

[With soft membership, we can easily get random dithering.]

Dithering increases distortion.

%IM%convert ^
  %SRC% ^
  ( +dither +clone -colors 2 ^
    +write kc_dith1.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info: 
0.0960118
kc_dith1.pngjpg
%IM%convert ^
  %SRC% ^
  ( +clone -colors 2 ^
    +write kc_dith2.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info: 
0.10747
kc_dith2.pngjpg

When we reduce colours without dithering, each pixel is changed to the closest (in some sense) member of a set of colours. Dithering changes the value of some of the pixels which increases the distortion at that pixel and hence overall, even though it improves the mean colour over the entire image, or even over small portions.

If we blur slightly, and get the RMSE of these blurred pixels, the numbers more closely measure our human experience.

%IM%convert ^
  %SRC% ^
  ( +dither +clone -colors 2 ^
    -blur 0x1 ^
    +write kc_dith3.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info: 
0.0898402
kc_dith3.pngjpg
%IM%convert ^
  %SRC% ^
  ( +clone -colors 2 ^
    -blur 0x1 ^
    +write kc_dith4.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info: 
0.084308
kc_dith4.pngjpg

Currently, dithering is implemented only for the fuzzy method.

Transparency

Alpha values (opacity) are always copied unchanged from the input to the output.

By default, the alpha channel takes no further part in the process.

When the regardAlpha option is used, there are two effects:

  1. During the cluster initialization, the module tries not to choose any fully-transparent colours.
  2. Fully-transparent pixels take no part in cluster calculation. 75% opaque pixels take 75% of a part in the calculation (they have less weight than a fully-opaque pixel), and so on.

For example:

%IMDEV%convert ^
  %SRC_HOL% ^
  -process 'kcluster k 3 regardAlpha' ^
  kc_tr1.png
kc_tr1.png

The clustering is not influenced by any fully-transparent pixels. We can composite this over the ordinary image:

%IMDEV%convert ^
  %SRC% ^
  kc_tr1.png ^
  -compose Over -composite ^
  kc_tr1_ov.png
kc_tr1_ov.pngjpg

Performance

The time required is proportional to the number of pixels in the image multiplied by the number of colours, multiplied by the number of iterations (lower tolerance = more iterations = more time).

Clustering a web-size image with ten clusters and close tolerance takes about one second.

Of these three methods, k-means is the fastest and harmonic is the slowest.

Large images

Large images take time to process.

set LGE_IMG=zp_sus_sat.tiff

%IM%identify %LGE_IMG% 
zp_sus_sat.tiff TIFF 4924x7378 4924x7378+0+0 16-bit sRGB 167.9MB 0.000u 0:00.000

We cluster it:

set TOL=tolerance 0.00005
%IMDEV%convert ^
  %LGE_IMG% ^
  -process 'kcluster kCols 10 initialize SdLine %TOL%' ^
  kc_mpd_lge1.miff
0 00:01:20

How much has clustering distorted the image?

%IMDEV%compare -metric RMSE %LGE_IMG% kc_mpd_lge1.miff NULL: 
2.08084e+08 (0.0484483)

Here is a crop from the large result:

%IM%convert ^
  kc_mpd_lge1.miff ^
  -gravity Center ^
  -crop 600x400+0+0 +repage ^
  kc_mpd_lge1_cr.png
kc_mpd_lge1_cr.pngjpg

A large image can be subsampled, and this small image is clustered, and the colours found are used to remap the large image.

%IMDEV%convert ^
  %LGE_IMG% ^
  ( +clone ^
    -resize 600x600 ^
    -process 'kcluster kCols 10 initialize SdLine %TOL%' ^
    -write mpr:MAP ^
    +delete ^
  ) ^
  +dither ^
  -remap mpr:MAP ^
  kc_mpd_lge2.miff
0 00:00:07

How close is this to the original?

%IMDEV%compare -metric RMSE %LGE_IMG% kc_mpd_lge2.miff NULL: 
2.24666e+08 (0.052309)

So, it is fast, but the distortion is slightly worse.

A crop from that:

%IM%convert ^
  kc_mpd_lge2.miff ^
  -gravity Center ^
  -crop 600x400+0+0 +repage ^
  kc_mpd_lge2_cr.png
kc_mpd_lge2_cr.pngjpg

The colours chosen are similar but not identical, and moving boundaries between pairs of colours creates a relatively large change to the image, so the results are noticably different. In the crop, the most obvious change is in the dark speckles at bottom-center.

So, we have clustered a small version, and mapped the large version to those colours.

Instead, we can cluster the small version and use those colours as the initial guess for a clustering of the large version.

%IMDEV%convert ^
  %LGE_IMG% ^
  ( +clone ^
    -resize 600x600 ^
    -process 'kcluster kCols 10 initialize SdLine %TOL%' ^
  ) ^
  -process 'kcluster initialize FromList %TOL%' ^
  kc_mpd_lge_kk.miff
0 00:00:40

How close is this to the original?

%IMDEV%compare -metric RMSE %LGE_IMG% kc_mpd_lge_kk.miff NULL: 
2.08084e+08 (0.0484483)

This is as good as directly clustering the large image, but takes half the time. How about a proper pyramid?

%IMDEV%convert ^
  %LGE_IMG% ^
  ( +clone ^
    -resize 50%% ^
    ( +clone ^
      -resize 50%% ^
      ( +clone ^
        -resize 50%% ^
        ( +clone ^
          -resize 50%% ^
          ( +clone ^
            -resize 50%% ^
            ( +clone ^
              -resize 50%% ^
              ( +clone ^
                -resize 50%% ^
                -process 'kcluster kCols 10 initialize SdLine %TOL% v' ^
              ) ^
              -process 'kcluster initialize FromList %TOL% v' ^
            ) ^
            -process 'kcluster initialize FromList %TOL% v' ^
          ) ^
          -process 'kcluster initialize FromList %TOL% v' ^
        ) ^
        -process 'kcluster initialize FromList %TOL% v' ^
      ) ^
      -process 'kcluster initialize FromList %TOL% v' ^
    ) ^
    -process 'kcluster initialize FromList %TOL% v' ^
  ) ^
  -process 'kcluster initialize FromList %TOL% v' ^
  kc_mpd_lge_pyr.miff
0 00:00:32

This is roughly the same speed. How close is this to the original?

%IMDEV%compare -metric RMSE %LGE_IMG% kc_mpd_lge_pyr.miff NULL: 
2.07881e+08 (0.0484011)

Again, it is as good as directly clustering the large image.

Conclusion: clustering by traversing a pyramid is accurate, but only saves about half the time. For real time-saving (but less accuracy), cluster a small version and map the large image to those colours.

FUTURE: The pyramid method would be faster if the module had an option of passing back the unique colours, rather than the entire image. This would save the time needed by the next level to get the unique colours.

How does it work?

The image has WxH pixels. Vi are pixel values, where i=0...W*H-1. Cj are k colours, where j=0...k-1.

Membership, m(Vi,Cj), is the proportion of Vi that belongs to Cj. For a given Vi, the sum of memberships over all Cj is 1.0. Depending on the method, membership is either "hard" (either 0 or 1), or "soft" (between 0.0 and 1.0).

Weight, w(Vi), is a property of Vi. w(Vi) > 0. Depending on the method, weight is a constant 1.0, or varies.

The overall algorithm normalises (if required), then calls the clustering algorithm once or multiple times, then does the inverse of normalising (if required).

The clustering algorithm iterates to a local optimum in four steps:

  1. Initialise: guess the k colours.
  2. For each Vi, calculate m(Vi,Cj) and w(Vi).
  3. For each Cj, re-calculate it:
          sum ( m(Vi,Cj) * w(Vi) * Vi )
    Cj' = -----------------------------
            sum ( m(Vi,Cj) * w(Vi) )
    ... where we sum over i=0...W*H-1.
  4. Repeat 2 and 3 until convergence.

Convergence is declared when every channel of every cluster is changed by a value less than or equal to the tolerance. Changes usually decrease between iterations, but sometimes increase.

For the null method, Cj is not recalculated so we have immediate convergence, and the result is whatever the initialisation created. (If we have "initialise Forgy" or "initialise RandPart", every result will be different.)

The details of membership and weight for the three methods are:

For k-means, membership is hard: m(Vi,Cj)=1 iff Cj is the closest colour to Vi. ie measuring distance as (dr^2+dg^2+db^2). Weight is a constant w(Vi)=1. The recalculation of Cj simplifies to:

      sum ( m(Vi,Cj) * Vi )
Cj' = ---------------------
        sum ( m(Vi,Cj) )

... where we sum over i=0...W*H-1.

For fuzzy k-means, membership is soft; each Vi belongs partly to all Cj. Let:

powdist1(Vi,Cj) = distance(V0,Cj)(-2/(r-1))

... where r >= 1. Then:

              powdist1(Vi,Cj)
m(Vi,Cj) = --------------------
           sum(powdist1(Vi,Cj))

... where we sum for j=0...k-1. Hence, for a given pixel Vi, the sum of its memberships is 1.0. Weight is a constant w(Vi)=1.

For k-harmonic, membership is soft, and weight varies. p is a parameter, p >= 2, eg p=3.5. We define:

powdist2(Vi,Cj) = distance(V0,Cj)(-p-2) 

powdist3(Vi,Cj) = distance(V0,Cj)(-p) 

             powdist2(Vi,Cj)
m(Vi,Cj) = --------------------
           sum(powdist2(Vi,Cj))

... where we sum for j=0...k-1. Again hence, for a given pixel Vi, the sum of its memberships is 1.0. The weight is given by:

         sum (powdist2(Vi,Cj))
w(Vi) = -----------------------
        (sum(powdist3(Vi,Cj)))2

... where again we sum for j=0...k-1.

Alpha

The algorithm as described above ignores alpha, so assumes the input image is opaque.

Alpha can be a mechanism for "turning pixels off", so they don't participate. The colour values in pixels that are entirely transparent then have no effect on the result.

When alpha represents opacity on a scale of 0.0 (entirely transparent) to 1.0 (entirely opaque), we can simply multiply either membership or weight by alpha.

When the regardAlpha option is used, there are two effects:

  1. During the cluster initialization, the module tries not to choose any fully-transparent colours.
  2. At the recalculation of Cj', the alpha value Ai (0=transparent ... 1.0=opaque) is used in both sums, so the general formula becomes:
          sum ( m(Vi,Cj) * w(Vi) * Vi * Ai )
    Cj' = ----------------------------------
            sum ( m(Vi,Cj) * w(Vi) * Ai )
    The values Cj', Cj and Vi are the three dimensions R, G and B.

Variations

Different colorspaces give different results:

%IMDEV%convert ^
  %SRC% ^
  -process 'kcluster' ^
  kc_cs1.png
kc_cs1.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -colorspace RGB ^
  -process 'kcluster' ^
  -colorspace RGB ^
  kc_cs2.png
kc_cs2.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -colorspace Lab ^
  -process 'kcluster' ^
  -colorspace RGB ^
  kc_cs3.png
kc_cs3.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -colorspace YIQ ^
  -process 'kcluster' ^
  -colorspace RGB ^
  kc_cs4.png
kc_cs4.pngjpg

I see no obvious reason for preferring one colorspace over another. The RGB version gives us more segmentation in the highlights and less in the shadows, which we might want. We can get the opposite result:

%IMDEV%convert ^
  %SRC% ^
  -evaluate Pow 0.5 ^
  -process 'kcluster' ^
  -evaluate Pow 2 ^
  kc_pow1.png
kc_pow1.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -evaluate Pow 0.25 ^
  -process 'kcluster' ^
  -evaluate Pow 4 ^
  kc_pow2.png
kc_pow2.pngjpg

We can do a similar trick with colour channels:

%IMDEV%convert ^
  %SRC% ^
  -channel R ^
  -evaluate Divide 4 ^
  +channel ^
  -process 'kcluster' ^
  -channel R ^
  -evaluate Multiply 4 ^
  +channel ^
  kc_ch1.png
kc_ch1.pngjpg
%IMDEV%convert ^
  %SRC% ^
  -channel GB ^
  -evaluate Divide 4 ^
  +channel ^
  -process 'kcluster' ^
  -channel GB ^
  -evaluate Multiply 4 ^
  +channel ^
  kc_ch2.png
kc_ch2.pngjpg

Prettifying the result

After creating a clustered result, we can prettify it with a number of methods.

A clustered result.

kc_ex1.png

kc_ex1.pngjpg

Some prettifiers:

Median filter smooths boundaries.

%IM%convert ^
  kc_ex1.png ^
  -statistic median 3x3 ^
  kc_p_med1.png
kc_p_med1.pngjpg

Larger median filter.

%IM%convert ^
  kc_ex1.png ^
  -statistic median 7x7 ^
  kc_p_med2.png
kc_p_med2.pngjpg

Remove small components.

call %PICTBAT%connCompLimArea ^
  kc_ex1.png ^
  kc_p_lim1.png ^
  0.05c
kc_p_lim1.pngjpg

Remove larger components.

call %PICTBAT%connCompLimArea ^
  kc_ex1.png ^
  kc_p_lim2.png ^
  0.5c
kc_p_lim2.pngjpg

Remove smallest components of each colour.

call %PICTBAT%lgstConnComp ^
  kc_ex1.png ^
  kc_p_lgstcc.png
kc_p_lgstcc.pngjpg

Remove larger components. blah something else

call %PICTBAT%connCompLimArea ^
  kc_ex1.png ^
  kc_p_lim2.png ^
  0.5c
kc_p_lim2.pngjpg

Remap with dither.

%IM%convert ^
  %SRC% ^
  -remap kc_ex1.png ^
  kc_p_rem1.png
kc_p_rem1.pngjpg

Remap without dither.

%IM%convert ^
  %SRC% ^
  +dither ^
  -remap kc_ex1.png ^
  kc_p_rem2.png
kc_p_rem2.pngjpg

Remap blurred with unblurred.

%IM%convert ^
  kc_ex1.png +write mpr:MAP ^
  ( +clone -blur 0x1 ) ^
  -delete 0 ^
  -remap mpr:MAP ^
  kc_p_blm1.png
kc_p_blm1.pngjpg

Remap blurred with unblurred.

%IM%convert ^
  kc_ex1.png +write mpr:MAP ^
  ( +clone -blur 0x2 ) ^
  -delete 0 ^
  -remap mpr:MAP ^
  kc_p_blm2.png
kc_p_blm2.pngjpg

Identifying edges (boundaries between clusters) may not be pretty, but can be useful:

%IM%convert ^
  kc_ex1.png ^
  -edge 1 ^
  kc_p_edge1.png
kc_p_edge1.pngjpg
%IM%convert ^
  kc_ex1.png ^
  -edge 1 ^
  -fill White +opaque Black ^
  kc_p_edge2.png
kc_p_edge2.pngjpg

The mean values of the edge images are measures of complexity, on a scale of 0.0 to 1.0.

%IM%convert ^
  kc_p_edge1.png ^
  -format %%[fx:mean] ^
  info: 
0.0572441
%IM%convert ^
  kc_p_edge2.png ^
  -format %%[fx:mean] ^
  info: 
0.331051

Making masks

If desired, we can make a mask for each colour. First, find the unique colours:

set MSK_SRC=kc_km_nsd0.png
set UNIQ_FOR_MSK=kc_mm_uniq.png

%IM%convert ^
  %MSK_SRC% ^
  -unique-colors ^
  +write txt: ^
  %UNIQ_FOR_MSK% 
# ImageMagick pixel enumeration: 5,1,65535,srgb
0,0: (17749,13222,14248)  #455533A637A8  srgb(27%,20%,22%)
1,0: (25486,26677,21314)  #638E68355342  srgb(39%,41%,33%)
2,0: (33381,30792,27863)  #826578486CD7  srgb(51%,47%,43%)
3,0: (42585,36559,35471)  #A6598ECF8A8F  srgb(65%,56%,54%)
4,0: (55326,48519,46544)  #D81EBD87B5D0  srgb(84%,74%,71%)

Then call the script mMapMasks.bat (see Gradient contours).

call %PICTBAT%mMapMasks %MSK_SRC% %UNIQ_FOR_MSK% kc_msks_.png

echo mmmNumMasks=%mmmNumMasks% 
mmmNumMasks=5 

This has created mask files kc_msks_1.png to kc_msks_5.png. These are white where the colour is; otherwise black.

kc_msks_1.png kc_msks_2.png kc_msks_3.png kc_msks_4.png kc_msks_5.png

It has also created cumulative masks kc_msks_c_2.png to kc_msks_c_5.png.

kc_msks_c_2.png kc_msks_c_3.png kc_msks_c_4.png kc_msks_c_5.png

Conclusions

The system works well. I haven't used it enough to draw firm conclusions about the favoured method and initialization.

For input images that are essentially flat-colour cartoons, with other colours from anti-aliasing or JPEG noise, the "jump" option is highly effective at identifying the clusters.

The default tolerance of 0.01 is rather generous. In real usage, 0.001 or 1e-5 might be preferred.

For precise work, the default maximum iterations of 100 may be too limiting.

When the input has transparency, the option regardAlpha should generally be used. I may make this the default.


All images on this page were created by the commands shown, using:

%IM%identify -version
Version: ImageMagick 6.9.5-3 Q16 x86 2016-07-22 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP 
Delegates (built-in): bzlib cairo flif freetype jng jp2 jpeg lcms lqr openexr pangocairo png ps rsvg tiff webp xml zlib
%IMDEV%identify -version
Version: ImageMagick 6.9.3-7 Q32 x86_64 2017-08-20 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2016 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules OpenMP 
Delegates (built-in): bzlib cairo fftw fontconfig freetype fpx jbig jng jpeg lcms ltdl lzma pangocairo png rsvg tiff webp wmf x xml zlib

To improve internet download speeds, some images may have been automatically converted (by ImageMagick, of course) from PNG to JPG.

Source file for this web page is kcluster.h1. To re-create this web page, execute "procH1 kcluster".


This page, including the images, is my copyright. Anyone is permitted to use or adapt any of the code, scripts or images for any purpose, including commercial use.

Anyone is permitted to re-publish this page, but only for non-commercial use.

Anyone is permitted to link to this page, including for commercial use.


Page version v1.0 8-August-2017.

Page created 21-Aug-2017 19:04:36.

Copyright © 2017 Alan Gibson.