ImageMagick can rasterise UTF-8 characters outside the ASCII range 0-127.
Procedures described here were tested under Windows 8.1 and Windows 11. The principles may be correct for other versions of Windows.
This page uses text encoded as UTF-8. If text is encoded differently, it may be converted to UTF-8 using "recode", a utility available for Unix, for Cygwin on Windows, and probably for most other platforms.
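Where "recode" is not to hand, the same kind of conversion can be sketched in a few lines of Python. This is only an illustration: the function name to_utf8 and the choice of Windows-1252 as the source encoding are assumptions, not something this page depends on.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# "élève" in Windows-1252 (an assumed source encoding) uses one byte per character.
cp1252_bytes = "élève".encode("cp1252")
utf8_bytes = to_utf8(cp1252_bytes, "cp1252")

# The two accented letters grow from one byte to two bytes each.
print(len(cp1252_bytes), "bytes in cp1252,", len(utf8_bytes), "bytes in UTF-8")
```

The source encoding has to be known in advance; decoding with the wrong one will either raise an error or silently produce the wrong characters.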
See also my Pango page.
You may need to tell your command window that you will be working with UTF-8.
chcp 65001
I will use "šŋĩβģő élève äëïöü" as a test string. It is meaningless but looks quite good. My web browser correctly displays these characters as:
If your browser shows something different, you may need to change something.
You may be able to type these characters with Alt plus something. See external page How to enter Unicode characters in Microsoft Windows.
Type the following command, or copy and paste it from this web page:
echo šŋĩβģő élève äëïöü>snibu8.txt
The file contains 33 bytes. Use dir snibu8.txt to verify this. There are 20 characters including the final carriage-return and line-feed, of which 7 occupy 1 byte and 13 occupy 2 bytes.
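The arithmetic above can be cross-checked without touching the file. The following Python sketch is an assumption-laden aside (the helper name utf8_byte_counts is made up for illustration); it normalises the string to precomposed form first, in case copy and paste produced combining sequences.

```python
import unicodedata

# The test string from this page, normalised to precomposed (NFC) form.
TEST_STRING = unicodedata.normalize("NFC", "šŋĩβģő élève äëïöü")


def utf8_byte_counts(text: str) -> dict:
    """Map bytes-per-character to the number of characters of that size."""
    counts = {}
    for ch in text:
        size = len(ch.encode("utf-8"))
        counts[size] = counts.get(size, 0) + 1
    return counts


# cmd's echo appends a carriage-return and line-feed.
with_crlf = TEST_STRING + "\r\n"
counts = utf8_byte_counts(with_crlf)

print("characters:", len(with_crlf))
print("bytes:", len(with_crlf.encode("utf-8")))
print("1-byte characters:", counts.get(1, 0))
print("2-byte characters:", counts.get(2, 0))
```

Dropping the final "\r\n" gives the figures expected for a file written without the trailing carriage-return and line-feed.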
Windows cmd echo always appends a carriage-return and line-feed. To avoid these:
set /p="šŋĩβģő élève äëïöü"<nul >snibu8noCr.txt
dir snibu8noCr.txt
If you edit snibu8.txt, Notepad will recognise that the file is encoded in UTF-8 and will show the correct characters. Sadly, Wordpad won't. If you type snibu8.txt it should display correctly on the console.
If you edit the file in Notepad, it will insert three extra bytes at the start of the file: 0xEF, 0xBB and 0xBF, the "byte order mark" (BOM). IM seems to ignore the BOM, or perhaps the font has no glyph. If you want to remove BOMs, see Removing BOM from a file below.
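To see whether a file has picked up a BOM, it is enough to look at its first three bytes. Here is a minimal Python sketch of that check; the function name has_utf8_bom is hypothetical, and the sample bytes stand in for the contents of a real file.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if data starts with the UTF-8 byte order mark EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


# Sample contents, as if read with open(name, "rb").read().
with_bom = codecs.BOM_UTF8 + "élève".encode("utf-8")
without_bom = "élève".encode("utf-8")

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))

# Python's "utf-8-sig" codec drops a leading BOM while decoding.
print(with_bom.decode("utf-8-sig") == without_bom.decode("utf-8"))
```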
We can create text in an image either by directly including the text in the command, or indirectly using a text file. Your chosen font needs to include glyphs for the characters you use. On my current computer, the default font contains all the glyphs in my test string.
Annotate:
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 ^
  -annotate 0 "šŋĩβģő élève äëïöü" ^
  u8_an.png

Caption:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  caption:"šŋĩβģő élève äëïöü" ^
  u8_ca.png

Label:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  label:"šŋĩβģő élève äëïöü" ^
  u8_la.png

Draw:
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 ^
  -draw "text 0,0 'šŋĩβģő élève äëïöü'" ^
  u8_dr.png

Pango:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  -pointsize 30 ^
  pango:"šŋĩβģő élève äëïöü" ^
  u8_pa.png
When the text comes from a file, as in the commands below, the final CR-LF in the file influences the graphical output.
Annotate:
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 ^
  -annotate 0 @snibu8.txt ^
  u8_an_t.png

Caption:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  caption:@snibu8.txt ^
  u8_ca_t.png

Label:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  label:@snibu8.txt ^
  u8_la_t.png

Draw:
echo text 0,0 'šŋĩβģő élève äëïöü'>u8_dr_t.txt
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 ^
  -draw @u8_dr_t.txt ^
  u8_dr_t.png

Pango:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  -pointsize 30 ^
  pango:@snibu8.txt ^
  u8_pa_t.png

SVG:
call %PICTBAT%setInkPath
%IMG7%magick ^
  -density 72 ^
  snutf8.svg ^
  u8_sv_t.png
The default font and size for pango are different from those of the other methods. In addition, "gravity center" doesn't centralise vertically.
snutf8.svg is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:cc="http://creativecommons.org/ns#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:svg="http://www.w3.org/2000/svg"
   xmlns="http://www.w3.org/2000/svg"
   width="400px"
   height="200px"
   id="svg2"
   version="1.1">
  <g id="layer1">
    <text xml:space="preserve" style="fill:#000000;fill-opacity:1;stroke:none" x="100" y="100" id="text2816"><tspan id="tspan2818" >šŋĩβģő élève äëïöü</tspan></text>
  </g>
</svg>
Arabic is written from right to left. Joined-up writing uses different glyphs from separated writing, even when printed or on computer screens.
According to Cambridge Dictionaries Online, the Arabic for "image" is "صَوْرة", and for "magic" is "سِحْر". So we have "صَوْرة سِحْر". Re-writing that phrase, with characters spaced so each is standalone, like "i m a g e m a g i c", looks like this: "صَ وْ ر ة سِ حْ ر".
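It can help to see what the phrase contains at the code-point level. As far as I understand it, the joined and standalone forms of a letter share one code point, and choosing the glyph and the right-to-left order is left to the renderer. The Python sketch below lists each code point with its Unicode bidirectional class and name; the helper name describe is made up for illustration.

```python
import unicodedata

# The Arabic phrase used on this page.
PHRASE = "صَوْرة سِحْر"


def describe(text: str) -> list:
    """Return (code point, bidirectional class, Unicode name) per character."""
    rows = []
    for ch in text:
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.bidirectional(ch),
            unicodedata.name(ch, "?"),
        ))
    return rows


for code_point, bidi_class, name in describe(PHRASE):
    print(code_point, bidi_class, name)
```

Letters report the class AL (Arabic letter) and the vowel marks report NSM (non-spacing mark). A renderer that ignores these classes will draw the characters left to right.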
With default rendering, each character is written separately, and in the wrong order (left to right):
Annotate:
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 ^
  -annotate 0 "صَوْرة سِحْر" ^
  u8_an_a.png

Caption:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  caption:"صَوْرة سِحْر" ^
  u8_ca_a.png

Label:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  label:"صَوْرة سِحْر" ^
  u8_la_a.png

Draw:
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 ^
  -draw "text 0,0 'صَوْرة سِحْر'" ^
  u8_dr_a.png

Pango:
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  -pointsize 30 ^
  pango:"صَوْرة سِحْر" ^
  u8_pa_a.png
Pango's default font doesn't include Arabic glyphs.
The IM setting "-direction" should be useful. In a right-to-left world, gravity is confused, and the automatic pointsize for caption and label doesn't work.
Annotate:
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 -direction right-to-left ^
  -annotate 0 "صَوْرة سِحْر" ^
  u8_an_a2.png
The direction is incorrect.

Caption:
%IMG7%magick ^
  -size 400x200 -gravity NorthWest ^
  -background khaki ^
  -pointsize 30 -direction right-to-left ^
  caption:"صَوْرة سِحْر" ^
  u8_ca_a2.png
The direction is correct but the characters are standalone.

Label:
%IMG7%magick ^
  -size 400x200 -gravity NorthWest ^
  -background khaki ^
  -pointsize 30 -direction right-to-left ^
  label:"صَوْرة سِحْر" ^
  u8_la_a2.png
The direction is correct but the characters are standalone.

Draw:
%IMG7%magick ^
  -size 400x200 xc:khaki -gravity Center ^
  -pointsize 30 -direction right-to-left ^
  -draw "text 0,0 'صَوْرة سِحْر'" ^
  u8_dr_a2.png
The direction is incorrect.

Pango: This needs a font to be specified.
%IMG7%magick ^
  -size 400x200 -gravity Center ^
  -background khaki ^
  -font Arial -pointsize 30 ^
  pango:"صَوْرة سِحْر" ^
  u8_pa_a2.png
The direction and glyphs are correct.

Pango: Testing wordwrap.
%IMG7%magick ^
  -size 200x200 -gravity Center ^
  -background khaki ^
  -font Arial -pointsize 50 ^
  pango:"صَوْرة سِحْر" ^
  u8_pa_a2ww.png
This seems correct.
Conclusion: for Arabic text, "pango:" is the obvious choice.
Some more examples of Pango, in English and Arabic:
Wordwrap.
%IMG7%magick ^
  -size 350x200 -gravity NorthEast ^
  -background khaki ^
  -font Arial -pointsize 30 ^
  pango:"wonderful powerful no-cost image magic" ^
  u8_pa_ex1.png

Wordwrap, with a forced new line.
%IMG7%magick ^
  -size 350x200 -gravity NorthEast ^
  -background khaki ^
  -font Arial -pointsize 30 ^
  pango:"wonderful powerful no-cost\nimage magic" ^
  u8_pa_ex2.png

Wordwrap.
%IMG7%magick ^
  -size 350x200 -gravity NorthWest ^
  -background khaki ^
  -font Arial -pointsize 30 ^
  pango:"رائع قَوي مَجّاني صَوْرة سِحْر" ^
  u8_pa_ex3.png

Wordwrap, with a forced new line.
%IMG7%magick ^
  -size 350x200 -gravity West ^
  -background khaki ^
  -font Arial -pointsize 30 ^
  pango:"رائع قَوي مَجّاني\nصَوْرة سِحْر" ^
  u8_pa_ex4.png
Here is a simple BAT script that strips any BOMs from a file. It relies on a helper file, bom.txt, that contains nothing but a BOM. To make it: open a UTF-8 file in Notepad, delete all the contents, and save the file as "bom.txt", checking that the encoding is UTF-8. Check with dir bom.txt that the file has three bytes. Move bom.txt to the same directory as the script deBom.bat.
The script will read a file, strip any BOMs, and write to stdout. Call it like this:
call deBom infile.txt >outfile.txt
The script deBom.bat is:
@rem Removes any Byte Order Marks (BOM) in a file.
@setlocal enabledelayedexpansion

@set BOMFILE=%~dp0bom.txt

@if not exist %BOMFILE% @(
  @echo Can't find %BOMFILE%
  @exit /B 1
)

@for /F %%a in (%BOMFILE%) do @set bom=%%a

@for /F "tokens=*" %%a in (%1) do @(
  @set line=%%a
  @set line=!line:%bom%=!
  @echo !line!
)

@exit /B 0
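Away from a Windows command prompt, the same idea can be sketched in Python. This is an alternative for illustration, not a translation of deBom.bat: the function names strip_boms and strip_boms_in_file are hypothetical, and the code works on raw bytes, so it needs no bom.txt helper file.

```python
import codecs


def strip_boms(data: bytes) -> bytes:
    """Return data with every UTF-8 BOM (bytes EF BB BF) removed."""
    return data.replace(codecs.BOM_UTF8, b"")


def strip_boms_in_file(infile: str, outfile: str) -> None:
    """Read infile as bytes, strip any BOMs, and write the result to outfile."""
    with open(infile, "rb") as src:
        cleaned = strip_boms(src.read())
    with open(outfile, "wb") as dst:
        dst.write(cleaned)


# In-memory demonstration: one BOM at the start, another after a blank line.
sample = codecs.BOM_UTF8 + b"first\r\n\r\n" + codecs.BOM_UTF8 + b"second\r\n"
print(strip_boms(sample))
```

Because it never splits the data into lines, this sketch keeps blank lines and the original line endings, whereas a for /F loop skips blank lines.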
All images on this page were created by the commands shown, using:
%IMG7%magick -version
Version: ImageMagick 7.1.1-15 Q16-HDRI x64 a0a5f3d:20230730 https://imagemagick.org
Copyright: (C) 1999 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI OpenCL OpenMP(2.0)
Delegates (built-in): bzlib cairo freetype gslib heic jng jp2 jpeg jxl lcms lqr lzma openexr pangocairo png ps raqm raw rsvg tiff webp xml zip zlib
Compiler: Visual Studio 2022 (193532217)
Source file for this web page is snutf8.h1, which is encoded in UTF-8. To re-create this web page, execute "procH1 snutf8".
This page, including the images, is my copyright. Anyone is permitted to use or adapt any of the code, scripts or images for any purpose, including commercial use.
Anyone is permitted to re-publish this page, but only for non-commercial use.
Anyone is permitted to link to this page, including for commercial use.
Page version v1.1 1-Dec-2014.
Page created 29-Sep-2023 08:14:46.
Copyright © 2023 Alan Gibson.