K-clustering

A process module for k-means, fuzzy k-means and k-harmonic, with transparency.

References
Sample inputs
The process module
Examples
Transparency
Performance
Large images
How does it work?
Variations
Prettifying the result
Making masks
Conclusions

These are colour-reduction methods, like IM's -colors, but different.

When an image has been clustered into a small collection of colours, each pixel is set to the closest colour. The colour used for a cluster may not be a good representative of the image pixels in the cluster, so the process adjusts the colour of each cluster to be more representative, then re-clusters the image to these new pixels, and iterates until colours no longer change, within a specified tolerance.

On this page, "distance" always means the difference between two colours, as measured by three colour channels. It is not concerned with positions of pixels on the image.

References

This page and the process module implements three of the clustering methods described in:

Alternatives to the k-means algorithm that find better clusterings Greg Hamerly, Charles Elkan, 2002.

However, the paper doesn't describe how opacity should be processed.

Also relevant:

Some Methods for Classification and Analysis of Multivariate Observations J. MacQueen, 1967.
Generalized K-Harmonic Means -- Dynamic Weighting of Data in Unsupervised Learning Bin Zhang, 2000.
Colour quantisation as a preprocessing step for image segmentation H. Palus & M. Frackiewicz, 2011.
A study on clustering for colour image quantisation H. Palus & M. Frackiewicz, 2013.

See also references in the section Jump: finding k below.

Sample inputs

set SRC=toes.png
set SRC_HOL=toes_holed.png

The process module

Option		Description
Short form	Long form	Description
`m string`	`method string`	Method for converging clusters, one of: `Means` "k-means" clustering; `Fuzzy` "fuzzy k-means" clustering; `Harmonic` "k-harmonic" clustering; `Null` no convergence; Default = `means`.
`k N` `k N-M`	`kCols N` `kCols N-M`	Number of clusters to find. Parameter is one integer, or two integers separated by a hyphen. When two integers, finds k as the best value between N and M inclusive by the jump method. For example: `kCols 10`, `kCols 5-15`. Default = `10`.
`j N,M`	`jump N,M`	DEPRECATED. Find k as the best value between N and M inclusive by the jump method. For example: `jump 1,10`
`s`	`sdNorm`	Transform input to equalise channel SDs, and invert this at the end.
`a`	`regardAlpha`	Process opacity.
`i string`	`initialize string`	Initialization for clusters, one of: `Colors` IM's "`-colors`"; `Forgy` chooses random pixels from the image; `RandPart` assigns image pixels to random partitions; `SdLine` uses colours from a short line in the colour cube; `FromList` uses unique colours from last image in list; Default = `colors`.
`y N`	`jumpY N`	Y power parameter for "jump" iterations. A positive number. Default = `0.5`, `1.0` or `1.5`.
`r N`	`fuzzyR N`	Power parameter r for fuzzy method. A positive number at least 1.0. Default = `1.5`.
`p N`	`harmonicP N`	Power parameter p for harmonic method. A positive number at least 2.0. Default = `3.5`.
`t N`	`tolerance N`	Stop iterating when worst channel change is less than or equal to this. Default = `0.01`.
`x N`	`maxIter N`	Stop after this many iterations. Default = `100`.
`w filename`	`write filename`	Write iterations as image files. For example: `write myfile_%06d.png`
`f string`	`file string`	Write verbose text to `stderr` or `stdout`. Default: `stderr`.
`v`	`verbose`	Write some text output to stderr.
`v2`	`verbose2`	Write more text output to stderr.
`d`	`debug`	Write debugging text output to stderr.

(Case is not significant.)

The module initializes to a set of kCols colours, and uses a method to iteratively improve the colours.

Instead of specifying an exact number of colours to find, we can find the number that gives the biggest jump in quality. The module will choose the best number for this image within that range.

sdNorm is an option to normalise the standard deviation of channels, so each dimension has the same influence.

Iteration continues until a tolerance is achieved, or when maxIter iterations have been performed. tolerance can be set to zero but arithmetical rounding may prevent a perfect result from being achieved.

Examples

The number shown beneath each image is the RMSE distortion from the input image, on a scale of 0.0 to 1.0, where 0.0 means "no change".

Default parameters:

%IM7DEV%magick ^
  %SRC% ^
  -process kcluster ^
  kc_ex1.png

0.0383549

Default method ("means"), varying kCols, with "-colors" shown for comparison:

-process 'kcluster' -colors

`-process 'kcluster'`	`-colors`
%IM7DEV%magick ^ %SRC% ^ -process 'kcluster kCols 3' ^ kc_ex2.png 0.0725123		%IM7DEV%magick ^ %SRC% ^ +dither ^ -colors 3 ^ kc_exc2.png 0.0780149
%IM7DEV%magick ^ %SRC% ^ -process 'kcluster kCols 4' ^ kc_ex4.png 0.0575264		%IM7DEV%magick ^ %SRC% ^ +dither ^ -colors 4 ^ kc_exc4.png 0.0673186
%IM7DEV%magick ^ %SRC% ^ -process 'kcluster kCols 5' ^ kc_ex5.png 0.0493609		%IM7DEV%magick ^ %SRC% ^ +dither ^ -colors 5 ^ kc_exc5.png 0.0656154

%IM7DEV%magick ^
  %SRC% ^
  -process 'kcluster kCols 3' ^
  kc_ex2.png

0.0725123

%IM7DEV%magick ^
  %SRC% ^
  +dither ^
  -colors 3 ^
  kc_exc2.png

0.0780149

%IM7DEV%magick ^
  %SRC% ^
  -process 'kcluster kCols 4' ^
  kc_ex4.png

0.0575264

%IM7DEV%magick ^
  %SRC% ^
  +dither ^
  -colors 4 ^
  kc_exc4.png

0.0673186

%IM7DEV%magick ^
  %SRC% ^
  -process 'kcluster kCols 5' ^
  kc_ex5.png

0.0493609

%IM7DEV%magick ^
  %SRC% ^
  +dither ^
  -colors 5 ^
  kc_exc5.png

0.0656154

For the same number of colours, the process module with default settings gives a better score than "-colors".

The other two methods: fuzzy and harmonic.

%IM7DEV%magick ^
  %SRC% ^
  -process 'kcluster method Fuzzy' ^
  kc_fz1.png

0.0391296

%IM7DEV%magick ^
  %SRC% ^
  -process 'kcluster method Harmonic' ^
  kc_hm1.png

0.0382404

Normalise SD

Not normalised	Normalised
%IM7DEV%magick ^ %SRC% ^ -process ^ 'kcluster m Means k 5' ^ kc_km_nsd0.png 0.0493609		%IM7DEV%magick ^ %SRC% ^ -process ^ 'kcluster m Means sdNorm k 5' ^ kc_km_nsd1.png 0.0514201
%IM7DEV%magick ^ %SRC% ^ -process ^ 'kcluster m Fuzzy k 5' ^ kc_fm_nsd0.png 0.0496455		%IM7DEV%magick ^ %SRC% ^ -process ^ 'kcluster m Fuzzy sdNorm k 5' ^ kc_fm_nsd1.png 0.0516036
%IM7DEV%magick ^ %SRC% ^ -process ^ 'kcluster m Harmonic k 5' ^ kc_hm_nsd0.png 0.0497404		%IM7DEV%magick ^ %SRC% ^ -process ^ 'kcluster m Harmonic sdNorm k 5' ^ kc_hm_nsd1.png 0.0509971

Not normalised

Normalised

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Means k 5' ^
  kc_km_nsd0.png

0.0493609

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Means sdNorm k 5' ^
  kc_km_nsd1.png

0.0514201

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy k 5' ^
  kc_fm_nsd0.png

0.0496455

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy sdNorm k 5' ^
  kc_fm_nsd1.png

0.0516036

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic k 5' ^
  kc_hm_nsd0.png

0.0497404

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic sdNorm k 5' ^
  kc_hm_nsd1.png

0.0509971

The normalised versions show more detail in the skin tones, and less in the grass. Numerically, the normalised versions are slightly less accurate.

The normalisation process implemented here is crude: it calculates the standard deviations of the three colour channels, and divides values in two channels in the appropiate ratio. (With appropriate processing when one, two or three channels have SD=0.)

A more sophisticated method may be more useful.

Initialization

The methods all start from an initial guess at the colours, and refine this by iteration. But they iterate towards a local optimum, not a global optimum, so the result will depend on the first guess.

The default "initialize Colors" uses the equivalent of the CLI -colors k. This isn't guaranteed to find as many colours as requested. When it doesn't, the module will try to fill the slots with random pixels from the image.

"initialize SdLine" uses k colours that lie on a straight line within the colour cube. The coordinates of the ends of this line are (mnR-sdR,mnG-sdG,mnB-sdB) and (mnR+sdR,mnG+sdG,mnB+sdB), where mnR is the mean of the red channel, sdR is the standard deviation of the red channel, and so on. Mean and SD calculations always take alpha into account, eg transparent pixels are ignored. The line is divided into k segments, and the mid-points of those segments are used for the k colours.

"initialize FromList" uses the unique colours from the last image in the current list. The number of unique colours gives k, so kCols has no relevance, and is ignored. Hence the list must contain at least two images. All the images in the list except the last will be clustered.

"initialize Forgy" picks the k colours from k pixels chosen at random from the image, so we get a different initialisation each time (unless we use the same seed).

"initialize RandPart" is the "random partition" method. This assigns each image pixel to a random cluster, and calculates the cluster colour as the mean of its pixels, weighted by alpha. This initialises the clusters to colours that are close to the image mean. Again, we get a different initialisation each time (unless we use the same seed).

Examples of "initialize Colors":

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Means i Colors k 5' ^
  kc_km_col.png

0.0493609

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy i Colors k 5' ^
  kc_fz_col.png

0.0496455

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic i Colors k 5' ^
  kc_hm_col.png

0.0497404

Examples of "initialize SdLine":

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Means i SdLine k 5' ^
  kc_km_sdl.png

0.0492722

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Fuzzy i SdLine k 5' ^
  kc_fz_sdl.png

0.0497469

%IM7DEV%magick ^
  %SRC% ^
  -process ^
    'kcluster m Harmonic i SdLine k 5' ^
  kc_hm_sdl.png

0.0502853

Examples of "initialize Forgy":

for /L %%I in (0,1,3) do %IM7DEV%magick ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Means initialize Forgy kCols 5' ^
  kc_km_init_%%I.png

The four versions have major differences.

for /L %%I in (0,1,3) do %IM7DEV%magick ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Fuzzy initialize Forgy kCols 5' ^
  kc_fm_init_%%I.png

The four versions have small differences.

for /L %%I in (0,1,3) do %IM7DEV%magick ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Harmonic initialize Forgy kCols 5' ^
  kc_hm_init_%%I.png

The four versions have small differences.

The default "initialise Colors" used above for the three methods gives a slightly better score than any of the "Forgy" results shown here. The "initialise Colors" needs far more CPU cycles to obtain.

Examples of "initialize RandPart":

for /L %%I in (0,1,3) do %IM7DEV%magick ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Means initialize RandPart kCols 5' ^
  kc_km_initrp_%%I.png

The four versions have very small differences.

for /L %%I in (0,1,3) do %IM7DEV%magick ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Fuzzy initialize RandPart kCols 5' ^
  kc_fm_initrp_%%I.png

The four versions have small differences.

for /L %%I in (0,1,3) do %IM7DEV%magick ^
  %SRC% ^
  -seed %%I ^
  -process ^
    'kcluster method Harmonic initialize RandPart kCols 5' ^
  kc_hm_initrp_%%I.png

The four versions have small differences.

"method Means initialize RandPart" gives a consistent result that is slightly better than "-colors 5".

"method Fuzzy initialize RandPart" is fairly consistent.

"method Harmonic initialize RandPart" is less consistent.

"initialize SdLine" gives a good result for "method Means", but not so good for the other methods.

Conclusion: when using "method Means", also specify "initialize RandPart" or "initialize SdLine". (But we shouldn't really draw conclusions from a single example.)

Examples of "initialize FromList":

%IM7DEV%magick ^
  %SRC% ^
  ( xc:Red xc:Lime xc:Blue ^
    xc:Cyan xc:Magenta ^
    +append +repage ^
  ) ^
  -process ^
    'kcluster m Means i FromList' ^
  kc_km_lst.png

0.049266

%IM7DEV%magick ^
  %SRC% ^
  ( xc:Red xc:Lime xc:Blue ^
    xc:Cyan xc:Magenta ^
    +append +repage ^
  ) ^
  -process ^
    'kcluster m Fuzzy i FromList' ^
  kc_fz_lst.png

0.0498286

%IM7DEV%magick ^
  %SRC% ^
  ( xc:Red xc:Lime xc:Blue ^
    xc:Cyan xc:Magenta ^
    +append +repage ^
  ) ^
  -process ^
    'kcluster m Harmonic i FromList' ^
  kc_hm_lst.png

0.0497165

More usefully, we can use the result of one clustering as the initial guess for another. See Large images below.

"initialize FromList" enables a variety of other initializations:

%IM7DEV%magick ^
  %SRC% ^
  ( +clone ^
    -resize "5x1^!" ^
  ) ^
  -process ^
    'kcluster m Means i FromList' ^
  kc_km_lst2.png

0.0493078

Animation

The write option does some extra processing at the end of each iteration: it applies the clustering to a copy of the image, and saves this to a file. Then, outside the IM command, we can make an animated GIF.

(Logically, the process module might simply make a list of the images in the usual way. But this can cause memory problems.)

del kc_frame_*.png

%IM7DEV%magick ^
  %SRC% ^
  -seed 1234 ^
  -process 'kcluster initialize Forgy t 0.001 write kc_frame_%%06d.png v' ^
  kc_frm.png

kcluster options:   method Means  kCols 10  initialize Forgy  tolerance 0.001  maxIter 100  write kc_frame_6d.png  verbose
kcluster: Input image [toes.png] 267x233 depth 16; nPixels 62211  k=10
kcluster: 10 uniq_img
kcluster: 21 iterations

This has created files kc_frame_000000.png, kc_frame_000001.png and so on. We make them into a GIF:

%IMG7%magick ^
  -loop 0 -delay 10 ^
  kc_frame_*.png ^
  -layers optimize ^
  kc_frame.gif

"initialize RandPart" starts with all colours close to the image mean, so this make an effective animation:

del kc_frame_*.png

%IM7DEV%magick ^
  %SRC% ^
  -seed 1234 ^
  -process 'kcluster initialize RandPart t 0.001 write kc_frame_%%06d.png' ^
  kc_frm2.png

%IMG7%magick ^
  -loop 0 -delay 10 ^
  kc_frame_*.png ^
  -layers optimize ^
  kc_frame2.gif

We don't need to keep the frame files:

del kc_frame_*.png

Jump: finding k

This uses the "jump" method to find the optimum k, the number of colours. See:

Wikipedia: Determining the number of clusters in a data set: An information-theoretic approach
Finding the number of clusters in a data set: An information theoretic approach Catherine A Sugar, Gareth M James, 2003

Also interesting:

Fast Algorithm for Finding True Number of Clusters. Applications to Control Systems Mikhail Morozkov, Oleg Granichin, Zeev Volkovich and Xuming Zhang, 2012.

The module creates clusters for some values of k, and calculates the RMSE distortion of each. (The RMSE distortion accounts for alpha, eg transparent pixels are ignored.) The distortion varies from 0.0 when the cluster exactly matches the input, up to 1.0. As k increases the distortion decreases and would eventually reach zero, when k is equal to the number of unique colours.

So minimizing the absolute distortion is not a useful goal. Instead, we measure how much distortion decreases when we increase k by one. The biggest decrease -- the biggest "jump" -- gives the best value for k.

To calculate the jump, we subtract distortions after raising them to a negative power -Y.

If d_k is the distortion between the input and the result with k clusters, then:

J_k = d_k^-Y - d_(k-1)^-Y

... where d₀^-Y=0 and Y = p/2, where p is the number of dimensions. For colour images, the colour cube has three axes, so p=3. For grayscale, p=1. However, ordinary colour photographs have correlation between the dimensions, so may have fewer effective dimensions.

The method is very simple and works well. However, it does need the clustering to be done. We can't give it an image and ask: how many clusters is this? We need to feed it the clustered results, and it will choose the best. So this has a performance cost. On the other hand, we can apply the method to any clustering tool, even "-colors" or "-posterize", even with dithering. Cool.

If we don't like the results at the found k, we can search for another, starting at k+1.

Note that the "jump" method is good at finding the most appropriate clustering. If the pixel colours don't naturally fall into a number of groups, it correctly reports "k=1". Here is a clear example of three clusters, which "jump" finds:

Input image	Clustered result
%IMG7%magick ^ -seed 1234 ^ -size 100x100 ^ xc:rgb(10%%,15%%,20%%) ^ xc:rgb(60%%,50%%,40%%) ^ xc:rgb(80%%,85%%,90%%) ^ +append +repage -blur 0x10 ^ -attenuate 30% ^ +noise uniform ^ kc_nse.png		set ARGS=i SdLine f stdout v %IM7DEV%magick ^ kc_nse.png ^ -process ^ 'kcluster k 1-10 %ARGS%' ^ kc_nse_out.png \|findstr maxJ kcluster: maxJ=8.0257423 at k=3

Input image

Clustered result

%IMG7%magick ^
  -seed 1234 ^
  -size 100x100 ^
  xc:rgb(10%%,15%%,20%%) ^
  xc:rgb(60%%,50%%,40%%) ^
  xc:rgb(80%%,85%%,90%%) ^
  +append +repage -blur 0x10 ^
  -attenuate 30% ^
  +noise uniform ^
  kc_nse.png

set ARGS=i SdLine f stdout v

%IM7DEV%magick ^
  kc_nse.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_nse_out.png |findstr maxJ

kcluster: maxJ=8.0257423  at k=3

In the next examples, there is no natural clustering into separate groups:

Input image	Clustered result
%IMG7%magick ^ -size 100x300 ^ gradient: ^ -rotate 90 ^ kc_nse2.png		%IM7DEV%magick ^ kc_nse2.png ^ -process ^ 'kcluster k 1-10 %ARGS%' ^ kc_nse2_out.png \|findstr maxJ kcluster: maxJ=21.495349 at k=7
%IMG7%magick ^ -seed 1234 ^ -size 300x100 ^ xc: ^ +noise random ^ kc_nse3.png		%IM7DEV%magick ^ kc_nse3.png ^ -process ^ 'kcluster k 1-10 %ARGS%' ^ kc_nse3_out.png \|findstr maxJ kcluster: maxJ=6.4446498 at k=1
%IMG7%magick ^ %SRC% ^ -colorspace Gray ^ kc_gray.png		%IM7DEV%magick ^ kc_gray.png ^ -process ^ 'kcluster k 1-10 %ARGS%' ^ kc_gray_out.png \|findstr maxJ kcluster: maxJ=52.724475 at k=4

Input image

Clustered result

%IMG7%magick ^
  -size 100x300 ^
  gradient: ^
  -rotate 90 ^
  kc_nse2.png

%IM7DEV%magick ^
  kc_nse2.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_nse2_out.png |findstr maxJ

kcluster: maxJ=21.495349  at k=7

%IMG7%magick ^
  -seed 1234 ^
  -size 300x100 ^
  xc: ^
  +noise random ^
  kc_nse3.png

%IM7DEV%magick ^
  kc_nse3.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_nse3_out.png |findstr maxJ

kcluster: maxJ=6.4446498  at k=1

%IMG7%magick ^
  %SRC% ^
  -colorspace Gray ^
  kc_gray.png

%IM7DEV%magick ^
  kc_gray.png ^
  -process ^
    'kcluster k 1-10 %ARGS%' ^
  kc_gray_out.png |findstr maxJ

kcluster: maxJ=52.724475  at k=4

A solution of k=1 may be unacceptable for aesthetic or other reasons. A simple remedy is to start the "jump" search at 2:

Clustered result
%IM7DEV%magick ^ kc_nse2.png ^ -process ^ 'kcluster k 2-10 %ARGS%' ^ kc_nse2t_out.png \|findstr maxJ kcluster: maxJ=21.495349 at k=7
%IM7DEV%magick ^ kc_nse3.png ^ -process ^ 'kcluster k 2-10 %ARGS%' ^ kc_nse3t_out.png \|findstr maxJ kcluster: maxJ=3.0559299 at k=4
%IM7DEV%magick ^ kc_gray.png ^ -process ^ 'kcluster k 2-10 %ARGS%' ^ kc_grayt_out.png \|findstr maxJ kcluster: maxJ=52.724475 at k=4

Clustered result

%IM7DEV%magick ^
  kc_nse2.png ^
  -process ^
    'kcluster k 2-10 %ARGS%' ^
  kc_nse2t_out.png |findstr maxJ

kcluster: maxJ=21.495349  at k=7

%IM7DEV%magick ^
  kc_nse3.png ^
  -process ^
    'kcluster k 2-10 %ARGS%' ^
  kc_nse3t_out.png |findstr maxJ

kcluster: maxJ=3.0559299  at k=4

%IM7DEV%magick ^
  kc_gray.png ^
  -process ^
    'kcluster k 2-10 %ARGS%' ^
  kc_grayt_out.png |findstr maxJ

kcluster: maxJ=52.724475  at k=4

Method "null"

This initialises the clusters, then doesn't attempt to improve them. This can answer the question: "What is the best number for -colors?"

%IM7DEV%magick ^
  %SRC% ^
  -process 'kcluster k 1-10 i colors m Null f stdout v' ^
  kc_mnull.png |findstr maxJ

kcluster: maxJ=32.644907  at k=6

We can also use the "null" method to prepare an initial list of colours for some other purpose:

%IM7DEV%magick ^ %SRC% ^ -process 'kcluster k 5 i Colors m Null' ^ -unique-colors ^ kc_mnull2.png call %PICTBAT%blockPix kc_mnull2.png
%IM7DEV%magick ^ %SRC% ^ -process 'kcluster k 5 i SdLine m Null' ^ -unique-colors ^ kc_mnull3.png call %PICTBAT%blockPix kc_mnull3.png
%IM7DEV%magick ^ %SRC% ^ -process 'kcluster k 5 i Forgy m Null' ^ -unique-colors ^ kc_mnull4.png call %PICTBAT%blockPix kc_mnull4.png
%IM7DEV%magick ^ %SRC% ^ -process 'kcluster k 5 i RandPart m Null' ^ -unique-colors ^ kc_mnull5.png call %PICTBAT%blockPix kc_mnull5.png

Although the "null" method doesn't attempt to improve the initial clustering, it does assign each pixel to the appropriate cluster. Hence, some colours may have no pixels assigned to them, so the output may not contain all the input colours. In any case, the order may change.

%IM7DEV%magick ^
  %SRC% ^
  ( xc:#012 xc:#123 xc:#234 xc:#345 xc:#456 xc:#567 ^
    +append +repage ^
  ) ^
  -process 'kcluster k 5 i FromList m Null' ^
  -unique-colors ^
  kc_mnull6.png

call %PICTBAT%blockPix kc_mnull6.png

Dithering

[With soft membership, we can easily get random dithering.]

Dithering increases distortion.

%IMG7%magick ^
  %SRC% ^
  ( +dither +clone -colors 2 ^
    +write kc_dith1.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info:

0.0960119

%IMG7%magick ^
  %SRC% ^
  ( +clone -colors 2 ^
    +write kc_dith2.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info:

0.0971351

When we reduce colours without dithering, each pixel is changed to the closest (in some sense) member of a set of colours. Dithering changes the value of some of the pixels which increases the distortion at that pixel and hence overall, even though it improves the mean colour over the entire image, or even over small portions.

If we blur slightly, and get the RMSE of these blurred pixels, the numbers more closely measure our human experience.

%IMG7%magick ^
  %SRC% ^
  ( +dither +clone -colors 2 ^
    -blur 0x1 ^
    +write kc_dith3.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info:

0.0898399

%IMG7%magick ^
  %SRC% ^
  ( +clone -colors 2 ^
    -blur 0x1 ^
    +write kc_dith4.png ^
  ) ^
  -metric RMSE ^
  -format %%[distortion] ^
  -compare ^
  info:

0.0880604

Currently, dithering is implemented only for the fuzzy method.

Transparency

Alpha values (opacity) are always copied unchanged from the input to the output.

By default, the alpha channel takes no further part in the process.

When the regardAlpha option is used, there are two effects:

During the cluster initialization, the module tries not to choose any fully-transparent colours.
Fully-transparent pixels take no part in cluster calculation. 75% opaque pixels take 75% of a part in the calculation (they have less weight than a fully-opaque pixel), and so on.

For example:

%IM7DEV%magick ^
  %SRC_HOL% ^
  -process 'kcluster k 3 regardAlpha' ^
  kc_tr1.png

The clustering is not influenced by any fully-transparent pixels. We can composite this over the ordinary image:

%IM7DEV%magick ^
  %SRC% ^
  kc_tr1.png ^
  -compose Over -composite ^
  kc_tr1_ov.png

Performance

The time required is proportional to the number of pixels in the image multiplied by the number of colours, multiplied by the number of iterations (lower tolerance = more iterations = more time).

Clustering a web-size image with ten clusters and close tolerance takes about one second.

Of these three methods, k-means is the fastest and harmonic is the slowest.

Large images

Large images take time to process.

set LGE_IMG=zp_sus_sat.tiff

%IMG7%magick identify %LGE_IMG%

zp_sus_sat.tiff TIFF 4924x7378 4924x7378+0+0 16-bit sRGB 207.88MiB 0.000u 0:00.000

We cluster it:

set TOL=tolerance 0.00005
%IM7DEV%magick ^
  %LGE_IMG% ^
  -process 'kcluster kCols 10 initialize SdLine %TOL%' ^
  kc_mpd_lge1.miff

0 00:07:02

How much has clustering distorted the image?

%IM7DEV%magick compare -metric RMSE %LGE_IMG% kc_mpd_lge1.miff NULL:

2.0829694e+08 (0.04849791)

Here is a crop from the large result:

%IMG7%magick ^
  kc_mpd_lge1.miff ^
  -gravity Center ^
  -crop 600x400+0+0 +repage ^
  kc_mpd_lge1_cr.png

A large image can be subsampled, and this small image is clustered, and the colours found are used to remap the large image.

%IM7DEV%magick ^
  %LGE_IMG% ^
  ( +clone ^
    -resize 600x600 ^
    -process 'kcluster kCols 10 initialize SdLine %TOL%' ^
    -write mpr:MAP ^
    +delete ^
  ) ^
  +dither ^
  -remap mpr:MAP ^
  kc_mpd_lge2.miff

0 00:00:13

How close is this to the original?

%IM7DEV%magick compare -metric RMSE %LGE_IMG% kc_mpd_lge2.miff NULL:

2.2483111e+08 (0.052347571)

So, it is fast, but the distortion is slightly worse.

A crop from that:

%IMG7%magick ^
  kc_mpd_lge2.miff ^
  -gravity Center ^
  -crop 600x400+0+0 +repage ^
  kc_mpd_lge2_cr.png

The colours chosen are similar but not identical, and moving boundaries between pairs of colours creates a relatively large change to the image, so the results are noticably different. In the crop, the most obvious change is in the dark speckles at bottom-center.

So, we have clustered a small version, and mapped the large version to those colours.

Instead, we can cluster the small version and use those colours as the initial guess for a clustering of the large version.

%IM7DEV%magick ^
  %LGE_IMG% ^
  ( +clone ^
    -resize 600x600 ^
    -process 'kcluster kCols 10 initialize SdLine %TOL%' ^
  ) ^
  -process 'kcluster initialize FromList %TOL%' ^
  kc_mpd_lge_kk.miff

0 00:03:43

How close is this to the original?

%IM7DEV%magick compare -metric RMSE %LGE_IMG% kc_mpd_lge_kk.miff NULL:

2.0829711e+08 (0.04849795)

This is as good as directly clustering the large image, but takes half the time. How about a proper pyramid?

%IM7DEV%magick ^
  %LGE_IMG% ^
  ( +clone ^
    -resize 50%% ^
    ( +clone ^
      -resize 50%% ^
      ( +clone ^
        -resize 50%% ^
        ( +clone ^
          -resize 50%% ^
          ( +clone ^
            -resize 50%% ^
            ( +clone ^
              -resize 50%% ^
              ( +clone ^
                -resize 50%% ^
                -process 'kcluster kCols 10 initialize SdLine %TOL% v' ^
              ) ^
              -process 'kcluster initialize FromList %TOL% v' ^
            ) ^
            -process 'kcluster initialize FromList %TOL% v' ^
          ) ^
          -process 'kcluster initialize FromList %TOL% v' ^
        ) ^
        -process 'kcluster initialize FromList %TOL% v' ^
      ) ^
      -process 'kcluster initialize FromList %TOL% v' ^
    ) ^
    -process 'kcluster initialize FromList %TOL% v' ^
  ) ^
  -process 'kcluster initialize FromList %TOL% v' ^
  kc_mpd_lge_pyr.miff

0 00:01:16

This is roughly the same speed. How close is this to the original?

%IM7DEV%magick compare -metric RMSE %LGE_IMG% kc_mpd_lge_pyr.miff NULL:

2.0809177e+08 (0.048450141)

Again, it is as good as directly clustering the large image.

Conclusion: clustering by traversing a pyramid is accurate, but only saves about half the time. For real time-saving (but less accuracy), cluster a small version and map the large image to those colours.

FUTURE: The pyramid method would be faster if the module had an option of passing back the unique colours, rather than the entire image. This would save the time needed by the next level to get the unique colours.

How does it work?

The image has WxH pixels. Vi are pixel values, where i=0...W*H-1. Cj are k colours, where j=0...k-1.

Membership, m(Vi,Cj), is the proportion of Vi that belongs to Cj. For a given Vi, the sum of memberships over all Cj is 1.0. Depending on the method, membership is either "hard" (either 0 or 1), or "soft" (between 0.0 and 1.0).

Weight, w(Vi), is a property of Vi. w(Vi) > 0. Depending on the method, weight is a constant 1.0, or varies.

The overall algorithm normalises (if required), then calls the clustering algorithm once or multiple times, then does the inverse of normalising (if required).

The clustering algorithm iterates to a local optimum in four steps:

Initialise: guess the k colours.
For each Vi, calculate m(Vi,Cj) and w(Vi).

For each Cj, re-calculate it:

      sum ( m(Vi,Cj) * w(Vi) * Vi )
Cj' = -----------------------------
        sum ( m(Vi,Cj) * w(Vi) )

... where we sum over i=0...W*H-1.

Repeat 2 and 3 until convergence.

Convergence is declared when every channel of every cluster is changed by a value less than or equal to the tolerance. Changes usually decrease between iterations, but sometimes increase.

For the null method, Cj is not recalculated so we have immediate convergence, and the result is whatever the initialisation created. (If we have "initialise Forgy" or "initialise RandPart", every result will be different.)

The details of membership and weight for the three methods are:

For k-means, membership is hard: m(Vi,Cj)=1 iff Cj is the closest colour to Vi. ie measuring distance as (dr^2+dg^2+db^2). Weight is a constant w(Vi)=1. The recalculation of Cj simplifies to:

      sum ( m(Vi,Cj) * Vi )
Cj' = ---------------------
        sum ( m(Vi,Cj) )

... where we sum over i=0...W*H-1.

For fuzzy k-means, membership is soft; each Vi belongs partly to all Cj. Let:

powdist1(Vi,Cj) = distance(V0,Cj)^(-2/(r-1))

... where r >= 1. Then:

              powdist1(Vi,Cj)
m(Vi,Cj) = --------------------
           sum(powdist1(Vi,Cj))

... where we sum for j=0...k-1. Hence, for a given pixel Vi, the sum of its memberships is 1.0. Weight is a constant w(Vi)=1.

For k-harmonic, membership is soft, and weight varies. p is a parameter, p >= 2, eg p=3.5. We define:

powdist2(Vi,Cj) = distance(V0,Cj)^(-p-2) 

powdist3(Vi,Cj) = distance(V0,Cj)^(-p) 

             powdist2(Vi,Cj)
m(Vi,Cj) = --------------------
           sum(powdist2(Vi,Cj))

... where we sum for j=0...k-1. Again hence, for a given pixel Vi, the sum of its memberships is 1.0. The weight is given by:

         sum (powdist2(Vi,Cj))
w(Vi) = -----------------------
        (sum(powdist3(Vi,Cj)))²

... where again we sum for j=0...k-1.

Alpha

The algorithm as described above ignores alpha, so assumes the input image is opaque.

Alpha can be a mechanism for "turning pixels off", so they don't participate. The colour values in pixels that are entirely transparent then have no effect on the result.

When alpha represents opacity on a scale of 0.0 (entirely transparent) to 1.0 (entirely opaque), we can simply multiply either membership or weight by alpha.

When the regardAlpha option is used, there are two effects:

During the cluster initialization, the module tries not to choose any fully-transparent colours.
At the recalculation of Cj', the alpha value Ai (0=transparent ... 1.0=opaque) is used in both sums, so the general formula becomes:
```
      sum ( m(Vi,Cj) * w(Vi) * Vi * Ai )
Cj' = ----------------------------------
        sum ( m(Vi,Cj) * w(Vi) * Ai )
```
The values Cj', Cj and Vi are the three dimensions R, G and B.

Variations

Different colorspaces give different results:

%IM7DEV%magick ^
  %SRC% ^
  -process 'kcluster' ^
  kc_cs1.png

%IM7DEV%magick ^
  %SRC% ^
  -colorspace RGB ^
  -process 'kcluster' ^
  -colorspace RGB ^
  kc_cs2.png

%IM7DEV%magick ^
  %SRC% ^
  -colorspace Lab ^
  -process 'kcluster' ^
  -colorspace RGB ^
  kc_cs3.png

%IM7DEV%magick ^
  %SRC% ^
  -colorspace YIQ ^
  -process 'kcluster' ^
  -colorspace RGB ^
  kc_cs4.png

I see no obvious reason for preferring one colorspace over another. The RGB version gives us more segmentation in the highlights and less in the shadows, which we might want. We can get the opposite result:

%IM7DEV%magick ^
  %SRC% ^
  -evaluate Pow 0.5 ^
  -process 'kcluster' ^
  -evaluate Pow 2 ^
  kc_pow1.png

%IM7DEV%magick ^
  %SRC% ^
  -evaluate Pow 0.25 ^
  -process 'kcluster' ^
  -evaluate Pow 4 ^
  kc_pow2.png

We can do a similar trick with colour channels:

%IM7DEV%magick ^
  %SRC% ^
  -channel R ^
  -evaluate Divide 4 ^
  +channel ^
  -process 'kcluster' ^
  -channel R ^
  -evaluate Multiply 4 ^
  +channel ^
  kc_ch1.png

%IM7DEV%magick ^
  %SRC% ^
  -channel GB ^
  -evaluate Divide 4 ^
  +channel ^
  -process 'kcluster' ^
  -channel GB ^
  -evaluate Multiply 4 ^
  +channel ^
  kc_ch2.png

Prettifying the result

After creating a clustered result, we can prettify it with a number of methods.

A clustered result.

kc_ex1.png

Some prettifiers:

Median filter smooths boundaries. %IMG7%magick ^ kc_ex1.png ^ -statistic median 3x3 ^ kc_p_med1.png		Larger median filter. %IMG7%magick ^ kc_ex1.png ^ -statistic median 7x7 ^ kc_p_med2.png
Remove small components. call %PICTBAT%connCompLimArea ^ kc_ex1.png ^ kc_p_lim1.png ^ 0.05c		Remove larger components. call %PICTBAT%connCompLimArea ^ kc_ex1.png ^ kc_p_lim2.png ^ 0.5c
Remove smallest components of each colour. call %PICTBAT%lgstConnComp ^ kc_ex1.png ^ kc_p_lgstcc.png		Remove larger components. blah something else call %PICTBAT%connCompLimArea ^ kc_ex1.png ^ kc_p_lim2.png ^ 0.5c
Remap with dither. %IMG7%magick ^ %SRC% ^ -remap kc_ex1.png ^ kc_p_rem1.png		Remap without dither. %IMG7%magick ^ %SRC% ^ +dither ^ -remap kc_ex1.png ^ kc_p_rem2.png
Remap blurred with unblurred. %IMG7%magick ^ kc_ex1.png +write mpr:MAP ^ ( +clone -blur 0x1 ) ^ -delete 0 ^ -remap mpr:MAP ^ kc_p_blm1.png		Remap blurred with unblurred. %IMG7%magick ^ kc_ex1.png +write mpr:MAP ^ ( +clone -blur 0x2 ) ^ -delete 0 ^ -remap mpr:MAP ^ kc_p_blm2.png

Identifying edges (boundaries between clusters) may not be pretty, but can be useful:

%IMG7%magick ^
  kc_ex1.png ^
  -edge 1 ^
  kc_p_edge1.png

%IMG7%magick ^
  kc_ex1.png ^
  -edge 1 ^
  -fill White +opaque Black ^
  kc_p_edge2.png

The mean values of the edge images are measures of complexity, on a scale of 0.0 to 1.0.

%IMG7%magick ^
  kc_p_edge1.png ^
  -format %%[fx:mean] ^
  info:

0.0572441

%IMG7%magick ^
  kc_p_edge2.png ^
  -format %%[fx:mean] ^
  info:

0.595811

Making masks

If desired, we can make a mask for each colour. First, find the unique colours:

set MSK_SRC=kc_km_nsd0.png
set UNIQ_FOR_MSK=kc_mm_uniq.png

%IMG7%magick ^
  %MSK_SRC% ^
  -unique-colors ^
  +write txt: ^
  %UNIQ_FOR_MSK%

# ImageMagick pixel enumeration: 5,1,0,65535,srgb
0,0: (17749,13222,14248)  #455533A637A8  srgb(27.0832%,20.1755%,21.7411%)
1,0: (25486,26677,21314)  #638E68355342  srgb(38.8891%,40.7065%,32.5231%)
2,0: (33381,30792,27863)  #826578486CD7  srgb(50.9361%,46.9856%,42.5162%)
3,0: (42585,36559,35471)  #A6598ECF8A8F  srgb(64.9805%,55.7855%,54.1253%)
4,0: (55326,48519,46544)  #D81EBD87B5D0  srgb(84.4221%,74.0352%,71.0216%)

Then call the script mMapMasks.bat (see Gradient contours).

call %PICTBAT%mMapMasks %MSK_SRC% %UNIQ_FOR_MSK% kc_msks_.png

echo mmmNumMasks=%mmmNumMasks%

mmmNumMasks=5

This has created mask files kc_msks_1.png to kc_msks_5.png. These are white where the colour is; otherwise black.

It has also created cumulative masks kc_msks_c_2.png to kc_msks_c_5.png.

Conclusions

The system works well. I haven't used it enough to draw firm conclusions about the favoured method and initialization.

For input images that are essentially flat-colour cartoons, with other colours from anti-aliasing or JPEG noise, the "jump" option is highly effective at identifying the clusters.

The default tolerance of 0.01 is rather generous. In real usage, 0.001 or 1e-5 might be preferred.

For precise work, the default maximum iterations of 100 may be too limiting.

When the input has transparency, the option regardAlpha should generally be used. I may make this the default.

All images on this page were created by the commands shown, using:

%IMG7%magick -version

Version: ImageMagick 7.1.1-15 Q16-HDRI x64 a0a5f3d:20230730 https://imagemagick.org
Copyright: (C) 1999 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI OpenCL OpenMP(2.0) 
Delegates (built-in): bzlib cairo freetype gslib heic jng jp2 jpeg jxl lcms lqr lzma openexr pangocairo png ps raqm raw rsvg tiff webp xml zip zlib
Compiler: Visual Studio 2022 (193532217)

%IM7DEV%magick -version

Version: ImageMagick 7.1.1-13 (Beta) Q32-HDRI x86_64 a8de149e1:20230703 https://imagemagick.org
Copyright: (C) 1999 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules OpenCL OpenMP(4.5) 
Delegates (built-in): bzlib cairo fftw fontconfig freetype heic jbig jng jpeg lcms ltdl lzma pangocairo png raqm raw rsvg tiff webp wmf x xml zip zlib
Compiler: gcc (11.3)

To improve internet download speeds, some images may have been automatically converted (by ImageMagick, of course) from PNG or TIFF or MIFF to JPG.

Source file for this web page is kcluster.h1. To re-create this web page, execute "procH1 kcluster".

This page, including the images, is my copyright. Anyone is permitted to use or adapt any of the code, scripts or images for any purpose, including commercial use.

Anyone is permitted to re-publish this page, but only for non-commercial use.

Anyone is permitted to link to this page, including for commercial use.

Page version v1.0 8-August-2017.

Page created 29-Sep-2023 01:22:24.