So finally we have v2! Thanks to the site owners and to all contributing users.
I just noticed that oneyoudontknow has already raised this problem and given a list of pages with deformed letters. That list could become huge once we inspect the Russian bands. Anyway, I want to take a slightly more systematic look at this issue. I hope the mods won't regard my post as a duplicate.
=====================================
Edit: I think making a list of pages with messed-up letters is helpful, but currently I don't have time to look up each band from, say, Eastern Europe and East Asia, so I will not take on that task. Users can also report messed-up letters.
Also, I suggest that the mods open the right to edit album titles to metal demons, i.e. users with more than 10,000 points. (Currently there are only 62 of them.) Then I would not have to report messed-up album titles to the mods. I don't know whether opening this right could be dangerous to the site. If it is not feasible, I suggest that we recruit some users who have specific knowledge of a language and let them deal with reports on this issue.
======================================
New Edit (2011/11/15): The following is a (highly incomplete) list of albums with mojibakes in tracklists, titles and lyrics. This list will keep changing as I compile more such releases and as users repair the mojibakes. If I add the tag "(lyrics)" after the release, it means that the lyrics are among the deformed items.
If you try to repair an album, make sure that you repair everything, including the lyrics; if you cannot repair the lyrics, please do not modify them.
An easy way to repair mojibakes: go to v1 of that page (you can find the v1 link at the bottom of each page), change the browser's encoding to the one for the corresponding language, and chances are the correct text will come back. This method is useful, but it does not guarantee a satisfactory result in every case! Example: for deformed Polish text, choose the encoding Central European (Windows) and see if the correct text returns.
Please confirm your results (e.g. by googling) before pasting them here.
NOTE: You do not need to know the specific language (of course it's better if you do), but you need to know what it looks like. For example, Polish has no letter "ê"; it shows up very often because it is the deformed form of "ę". Similarly, the strange "³" is the deformed form of "ł" (l with stroke).
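For those who prefer a script over the browser menu, the same trick can be sketched in a few lines of Python. This is only a minimal sketch under an assumption: that the garbled text is what you get when Central European (Windows-1250) bytes are displayed as Western European (Windows-1252). The function name `repair_mojibake` and its parameters are my own invention, not anything the site provides.

```python
def repair_mojibake(garbled: str, shown_as: str = "cp1252", written_as: str = "cp1250") -> str:
    """Undo a wrong decoding: recover the raw bytes, then decode them with the right encoding."""
    try:
        return garbled.encode(shown_as).decode(written_as)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The guessed encoding pair does not fit this text; return it unchanged
        # rather than risk damaging it further.
        return garbled


print(repair_mojibake("siê"))     # Polish "się"
print(repair_mojibake("mi³oœæ"))  # Polish "miłość"
```

As with the browser method, this does not guarantee a correct result, so check the output against another source before pasting it anywhere.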
http://www.metal-archives.com/albums/Ar ... ange/47712 (lyrics)
http://www.metal-archives.com/albums/Cl ... onie/39369 (lyrics)
http://www.metal-archives.com/albums/Cr ... 93ci/14883 (lyrics)
http://www.metal-archives.com/albums/De ... mes/110960 (lyrics)
http://www.metal-archives.com/albums/Dr ... Angel/8674
http://www.metal-archives.com/albums/He ... veto/25059
http://www.metal-archives.com/albums/He ... ream/31742 (the lyrics are NOT deformed)
http://www.metal-archives.com/albums/He ... ony/102494
http://www.metal-archives.com/albums/He ... ania/74685
======================================
A particularly annoying thing in v1 is the deformation of non-Latin characters into unreadable mojibakes (see below). As a consequence, a lot of valuable information about band names, album titles, tracklists and lyrics is lost. I have spent a lot of effort over the years repairing the deformed characters on the pages of Chinese/Japanese bands, and it is heart-breaking to see them deformed again after some time. The tracklists of small/obscure bands are really difficult to find, and I often need to refer to my own CD collection. But now I am abroad and my CDs are not with me, so once a tracklist gets deformed, I may never be able to repair it. Since the site serves as an encyclopedia, the key information in the original language is valuable and helpful to native speakers. And since v2 uses UTF-8 encoding, letter deformation should no longer be a problem, so let us repair the mojibakes left over from v1 and add key information in the original language.
The following is my little guideline. Any suggestions or opinions are welcome.
What are mojibakes?
Mojibakes are deformed letters that occur when text is read with a different encoding from the one it was written in, for instance when the default encoding of a website is incompatible with the input language. For example, if I typed Рассвет (in Cyrillic) in v1 of Metal Archives, it would likely be deformed into Ðàññâåò, which is totally unreadable. This phenomenon is widespread on band pages that contain characters outside the basic Latin alphabet, e.g. in German, Finnish, Czech, Russian, Japanese, Chinese, etc.
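To make the mechanism concrete, here is a small Python sketch that reproduces the Рассвет example. It assumes the text was stored as Cyrillic (Windows-1251) bytes and then displayed as Western European (Windows-1252); I do not know which encodings v1 really used, so treat the pair as an illustration. The helper name `garble` is hypothetical.

```python
def garble(text: str, written_as: str = "cp1251", shown_as: str = "cp1252") -> str:
    """Simulate mojibake: store text in one encoding, then read the same bytes in another."""
    raw = text.encode(written_as)  # the bytes that actually get stored
    return raw.decode(shown_as)    # the same bytes, interpreted with the wrong table


print(garble("Рассвет"))  # Ðàññâåò
```

The bytes themselves never change; only the table used to turn them into letters does. That is why the damage can often be undone as long as nobody has edited the garbled text in the meantime.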
UTF-8 solution
v2 uses UTF-8 as its encoding, which can represent every character in the Unicode character set. So this should produce no more mojibakes.
But the mojibakes that already existed in v1 will not automatically be fixed in v2. So I suggest that users who see any mojibakes report them, or fix them if they have the power.
HTML entities: Some words were written as HTML entities and displayed normally in v1. For example, Kurazh has a song called “Дождь” whose title was stored as HTML entities (presumably numeric references such as &#1044;&#1086;&#1078;&#1076;&#1100;) rather than as the letters themselves. When you read the page, everything is displayed fine. But if you search for the song “Дождь”, nothing is returned. The user can use the edit tool to see whether the text is written as HTML entities or as the letters themselves.
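Here is a short Python illustration of why the search fails and how such a title could be converted back. The stored string below is my assumption of what the entity form of “Дождь” looks like (decimal numeric references); the actual database content may differ.

```python
import html

# Assumed entity form of the title, as it might sit in the database.
stored = "&#1044;&#1086;&#1078;&#1076;&#1100;"

# A plain text search compares raw characters, so the real letters are not found.
print("Дождь" in stored)      # False

# Decoding the entities gives back the letters a browser would display.
print(html.unescape(stored))  # Дождь
```

`html.unescape` is part of the Python standard library and handles both named and numeric entities.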
Note:
Please never switch the encoding away from the default UTF-8. I guess that if you try to edit a page under another encoding, it will still cause trouble.
What is key information?
IMO, the band name, label name, band members’ names, album title, tracklist, lyrics are key information.
When should the original language accompany the key information?
Only official key information should come with the original language. For example, Seikima-II is officially known as 聖飢魔 II in Japan, and Tang Dynasty is officially (and only) known as 唐朝 in China, so we should add the original band names. Ritual Day has two official names (look at the logo), one in English and one in Chinese (施教日); the Chinese name should also be added. On the other hand, Dark Mirror ov Tragedy from Korea has no Korean name, so there is no need to add any Korean translation.
Some bands give the tracklist in more than one language. The tracklist in each language should be recorded: either the one in the “Main language” appears in the tracklist on Metallum and the other goes to the additional notes (the user should also mark that the translation is official), or both appear in the tracklist on Metallum. However, if the band gives only one tracklist, then the original one should go to the tracklist on Metallum. The user may give an English translation or a Romanization if the original is not in English, but it should go to the additional notes. In the past, to avoid mojibakes, some information was translated or transliterated and the original was left out.
Example: Forest (Rus) – In the Flame of Glory. The information is provided by the band in both Russian and English (look at the cover). Its Russian title is “В Пламени Славы”. Some song titles are also in Russian. So the Russian version should appear as main language and English info should also be added and marked as official.
Example: Tang Dynasty (唐朝) - A Dream Return to Tang Dynasty. As far as I know, this album never had an official English translation; it is only known as "梦回唐朝". So I stubbornly think that the title should at least be "梦回唐朝 (A Dream Return to Tang Dynasty)" in the discography, or the English translation should go to the additional notes.
Agree?
Special languages
Serbian. This is the only European language with active digraphia; it can be written in both Cyrillic and Latin script. Which script should be provided depends on the band. For example, on Dažd - Naživo!, everything is written in Cyrillic, so the title is “Наживо!”, and the tracklist on the back cover is also in Cyrillic script. So we should provide the information in Cyrillic, and may add the corresponding Latin script to the additional notes.
Chinese. There are two ways to write it, Traditional Chinese and Simplified Chinese. The latter is widely used in Mainland China, and the former is standard in Taiwan. Some bands from the Mainland write their tracklists in Traditional characters. But IMO the two forms are interchangeable, so it suffices to provide the info in one of them.
Translation or Transliteration?
This is a tricky problem. Transliteration does not help the reader understand the language. It only represents the approximate pronunciation, which, in some cases, may be quite misleading. For Japanese, there is an observable trend toward transliteration rather than translation. For example, Japan’s 伝承歌劇団 is known as Densyou-Kagekidan, rather than “Traditional Opera” (translation). On the other hand, Chinese people tend to translate things, because words written in the standard transliteration system, pinyin, would look a little weird. For example, the original name of The Dark Prison Massacre is 暗狱戮尸, which, if transliterated, would be “An Yu Lu Shi”.
In my opinion, band members’ names should always be transliterated. As for band names and tracklists, it depends on the user's preference.
Half-width or Full-width?
Since the East Asian languages are written in blocks, their characters are wider than Latin letters; this is called the full-width form. It is possible to write English in full-width form, so one can get “Ｍｅｔａｌ　ａｒｃｈｉｖｅｓ”, which is so ugly. Also, in Japanese, it is possible to write katakana in half-width form, but this is very rare.
So the rule is: for letter-based languages, use the half-width form. For East Asian languages (Chinese, Japanese and Korean), use the full-width form. In most cases, users need not worry about the correct form, because the correct form for a language is also the default form unless it is changed manually.
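If someone wants to check or clean up the width of a title automatically, Unicode compatibility normalization (NFKC) happens to follow the rule above: it turns full-width Latin letters into ordinary ones and half-width katakana into the normal full-width ones. Here is a minimal sketch in Python; the function name `normalize_width` is hypothetical. One caveat: NFKC also rewrites other compatibility characters (ligatures, circled digits and so on), so it may change more than just the width.

```python
import unicodedata


def normalize_width(text: str) -> str:
    """Apply NFKC, which folds full-width Latin and half-width katakana to their usual forms."""
    return unicodedata.normalize("NFKC", text)


print(normalize_width("Ｍｅｔａｌ　ａｒｃｈｉｖｅｓ"))  # full-width Latin becomes plain Latin
print(normalize_width("ﾒﾀﾙ"))                          # half-width katakana becomes full-width
print(normalize_width("唐朝"))                          # ordinary CJK text is left alone
```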