The languages I use as examples
I only show languages where I was able to find an example of UDHR Article 1.
European languages using Latin alphabets.
In January 2025 I started to use a LuaLaTeX template for listing which languages
are supported, and dropped several languages where I was not wholly sure how
they ought to appear.
If they are present, European Latin alphabets are, or were, shown for the following
purposes:
- Azeri : schwa as well as dotted/dotless i and g with breve, s with
cedilla
(and yes, some geographers consider that Azerbaijan is in Europe).
For revisions after 2025-01-26 I have dropped Azeri.
- Catalan : precomposed l with middle dot, separate middle dot (the
precomposed versions are uncommon and deprecated, although often
much more readable, so I show Catalan even if those are missing).
- Czech : carons, particularly on e, n, r.
- Danish : ae, o with stroke, a with ring.
- Dutch : the ij digraph (again, uncommon).
- French : the usual accents, c-cedilla, ae ligature, oe ligature, also n-tilde and
y-diaeresis (uppercase Y-diaeresis is very uncommon so I will accept its omission).
- German : umlauts on a,o,u and sharp-s (ß).
- Hungarian : double acutes on o and u.
- Icelandic : eth, thorn, ae.
- Italian : a few more accents.
- Latvian : cedillas on other letters, macrons.
- Lithuanian : dotted e, i and u with ogonek.
- Maltese : c and g with dot above, h-stroke.
- Northern Sami : d and t with stroke, eng.
- Polish : ogoneks on a,e, acute on c,n,o,s,z, l-stroke, dotted z.
- Portuguese : tilde on a,o.
- Romanian : a with breve, s and t with comma below,
also the incorrect s and t with cedilla.
- Serbo-Croat : d-stroke, digraphs for d-z-caron, lj, nj (the digraphs are uncommon
so again I show the alphabet even where they are not present).
- Slovenian : carons on (only) c,s,z - only shown if a font cannot do Czech but still
has these letters with carons.
- Spanish : n tilde (also mentioned in French, but I overlooked
that).
- Turkish : dotted/dotless i, g with breve, s with cedilla.
For revisions before 2025-01-26 this was only shown if Azeri was not supported.
For revisions after that date it is now always shown if supported.
- Welsh : accents and diaeresis on w and y.
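The rule used in the list above (show a language when its essential letters are present, and tolerate the absence of uncommon precomposed forms) can be sketched in a few lines of Python. This is only an illustration: the `LANGUAGES` table, the `classify` function and the split into "required" and "optional" letters are assumptions made for the example, covering a handful of lowercase letters for five of the languages, and `font_chars` stands in for whatever set of characters a real font reports.

```python
# Hypothetical table: a few letters per language, taken from the list above.
# "optional" holds the uncommon forms whose absence is tolerated.
LANGUAGES = {
    # middle dot required; precomposed l with middle dot optional
    "Catalan": {"required": {"\u00b7"}, "optional": {"\u0140", "\u013f"}},
    # e, n, r with caron
    "Czech": {"required": {"\u011b", "\u0148", "\u0159"}, "optional": set()},
    # ij digraph is uncommon, so optional
    "Dutch": {"required": set(), "optional": {"\u0133", "\u0132"}},
    # a, o, u with umlaut and sharp-s
    "German": {"required": set("\u00e4\u00f6\u00fc\u00df"), "optional": set()},
    # o and u with double acute
    "Hungarian": {"required": {"\u0151", "\u0171"}, "optional": set()},
}


def classify(language, font_chars):
    """Return 'missing', 'partial' or 'full' for one language.

    font_chars is the set of characters the font is assumed to cover.
    """
    spec = LANGUAGES[language]
    if not spec["required"] <= font_chars:
        return "missing"   # an essential letter is absent: do not show
    if not spec["optional"] <= font_chars:
        return "partial"   # show, but an uncommon form is absent
    return "full"


# Example font coverage, invented for illustration.
font_chars = set("\u00e4\u00f6\u00fc\u00df\u00b7\u011b\u0148\u0159")
for name in sorted(LANGUAGES):
    print(name, classify(name, font_chars))
```

With the invented coverage above, German and Czech come out as fully covered, Catalan and Dutch as shown without their uncommon precomposed forms, and Hungarian as not shown.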
Other languages using Latin alphabets
Non-European languages using variations of Latin alphabets (various African alphabets,
also Vietnamese) are covered separately at the end of the PDF languages files for those
fonts which support them. For revisions after 2025-01-26 all of these except Vietnamese
have been dropped.
Cyrillic alphabets
The Cyrillic alphabets are, or were, used for the following purposes:
- Abkhazian : ghe, ka, pe, te, xa with descender (or ghe, pe with old middle hook),
ka with stroke, abkhazian chei, lowercase schwa. When a font does not include the
ghe and pe versions with descender, I considered that the font was not suitable -
enough time has elapsed. The Article 1 text I copied included the old middle hook,
but while checking for updates of UDHR translations I updated that character.
Dropped in revisions after 2025-01-26.
- Adyghe : palochka. Dropped in revisions after 2025-01-26.
- Kazakh : straight u, ghe with stroke, barred o, straight u with stroke, en with
descender. The Article 1 text I pasted has non-breaking hyphens, which are not present
in some fonts, so I have replaced them with dashes or hyphen-minus.
Dropped in revisions after 2025-01-26.
- Macedonian : gje, kje (ghe and ka with acute), lje, nje and ie and i with
grave.
- Serbo-Croat : dje, tshe, lje, nje.
- Tatar : schwa, en and zhe with descenders. Dropped in revisions after 2025-01-26.
- Ukrainian : ghe with upturn, ukrainian ie, i, yi. I also use the Ukrainian alphabet
to show the forms of Cyrillic italics if italics are present.
- Yakut : ghe with middle hook, en ghe ligature, straight u. Dropped in revisions
after 2025-01-26.
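The hyphen substitution mentioned under Kazakh is easy to reproduce. The following is a minimal sketch, not the procedure actually used for these files: the function name and the sample string are made up, and it only handles U+2011 NON-BREAKING HYPHEN, replacing it with hyphen-minus by default.

```python
NON_BREAKING_HYPHEN = "\u2011"  # U+2011, absent from some fonts
HYPHEN_MINUS = "-"              # U+002D, present in practically every font


def replace_non_breaking_hyphens(text, replacement=HYPHEN_MINUS):
    """Return text with every U+2011 swapped for the replacement string."""
    return text.replace(NON_BREAKING_HYPHEN, replacement)


# Invented sample, not taken from the UDHR text.
sample = "abc\u2011def"
print(replace_non_breaking_hyphens(sample))
```

Passing a different `replacement` (an en dash, for instance) covers the cases where a dash was preferred to hyphen-minus.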
My naming of languages and scripts.
I use 'Serbo-Croat' to cover Bosnian, Croatian, Montenegrin, Serbian.
For other writing systems I followed the Unicode naming, but for revisions in 2024 I
have renamed 'Canadian Aboriginal' to 'First Nations'.
Other comments on my languages PDFs
Hyphenation: see my history file for the gory details.
TLDR - before 2024 I just accepted what came up.
When I started out, it seemed a good idea to minimize the size of files, to save space
and to reduce upload/download times. Therefore I used 10pt as my standard text size.
That is not a problem when looking at a PDF (just zoom in for a larger view), but I will
make two comments about this:
- 10pt is probably a bit smaller than what webpages use as a default size. Some
fonts have multiple sizes and 10pt might use the smallest, so the shapes in larger
sizes might be a lot better.
- I had expected that using a standard size would make the text in each file a similar
size. That might be true for capital letters, but the size of lowercase letters varies
greatly - in particular, some old-style serif fonts have tiny lowercase letters.
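The variation in lowercase size at a fixed point size comes down to simple arithmetic: a font's x-height is stored in font design units, and the rendered height is that value scaled by the point size over the units per em. The sketch below illustrates the calculation only. The two sets of metrics are invented for the example and do not describe any particular font.

```python
def x_height_pt(size_pt, x_height_units, units_per_em):
    """Rendered x-height in points for text set at size_pt."""
    return size_pt * x_height_units / units_per_em


# Invented metrics for two hypothetical fonts, both set at 10pt.
large_lowercase = x_height_pt(10, 500, 1000)   # 5.0pt
small_lowercase = x_height_pt(10, 800, 2048)   # 3.90625pt

print(large_lowercase, small_lowercase)
```

On these made-up figures the first font's lowercase letters are more than a quarter taller than the second's, even though both are nominally 10pt.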