Towards Fully Automated Manga Translation (arxiv.org)
144 points by polm23 on Dec 29, 2020 | 75 comments


The quality of a translation makes a dramatic difference in the reading experience. I have 4 or more copies of The Art of War, each with a different translator. Dante’s Inferno, the Iliad, Dostoyevsky, Beowulf, Confucius: some translations are unreadable and one or two are incredible. Automation is just not going to produce the really good translations. The only benefit of a technique like this will be to generate unlicensed translations of work that will not otherwise get translated, which isn’t even that big a deal because the communities already produce unlicensed translations of just about everything.

Edit: The other use is to help create better context aware data scrapers that can combine bi-modal information streams and add some body language understanding. I guess it will probably end up in automated surveillance tech like security cameras/mics etc if it works.


I can think of quite a few manga/anime/videogames where a machine translation would've been acceptable because I just needed a bare minimum understanding of what was going on. Think of any number of 8-bit games where the game is 95% hack-and-slash, but a random villager tells you before the last boss fight, "You must make sure to wear the enchanted amulet in order to pierce the dragon's scales!"

It's one throwaway line, but if the line is in Japanese, an American playing an imported ROM might spend an hour of frustration wondering why his sword does zero damage to the final boss.


The NES had plenty of real examples where poor translations made the game harder or made secrets obscure and hard to find without a guide. Castlevania 2 and Adventure of Link come to mind.

Actually, I'd say most text-heavy games translated from Japanese on the NES suffered from this problem, which made those games way more confusing than they should have been.


Sure...but I don't feel like we're comparing apples to apples here, since I explicitly mentioned text-light games and you brought up the most text-heavy, RPG like entries in the respective CV and Zelda series (of that generation).

To boot, Simon's Quest also had the atrociously bad idea that NPCs in the game can lie to you and give intentionally incorrect information, making the translation effort that much more confusing.


If a low-quality machine translation is sufficient to make those games playable because they're text-light, I question whether that's actually any better than having a human who knows the language spend an hour swapping the text out instead. You'd be surprised how often machine TL will muck up things like menu options or item names. The appeal of automated TL makes way more sense if it means saving hundreds of hours of localization work (not that it does...)


Mostly because it would take more than an hour and requires an interested translator. I sure can't speak Japanese (yet), but I have enough technical chops to spend a few hours getting something workable out with this alone.

Then from there maybe I can trigger Cunningham's law and get the attention of someone who knows what they are doing. Sounds like a win-win for me.


"You must wear a desirable necklace to surpass the balances of the great lizard"


A bad translation can improve bad writing. By using unfamiliar phrasing and word choice it decontextualizes, allowing the player to imagine meanings and nuance where it isn't there.


I get what you're getting at, but I think it fundamentally misunderstands something.

If the writing is bad, there's nothing that can really "improve" it other than the original author cleaning it up with the help of an editor. A bad translation is effectively a new work, at best inspired by the original bad script. You could replace a bad Japanese script with a "good" English one - this has happened before - but at that point it's questionable whether any translation has happened at all, you're mostly writing new content inspired by the original work or adhering to broad constraints. What I'd say you're doing here is improving the experience of playing the game, but you haven't done anything meaningful to the writing or script.

In a few cases western companies have licensed Japanese works and spliced them together with entirely new plots for overseas audiences - Robotech is one infamous example where arguably there was nothing wrong with the source material and the result wasn't just a liberal translation.


> If the writing is bad, there's nothing that can really "improve" it other than the original author cleaning it up with the help of an editor. A bad translation is effectively a new work, at best inspired by the original bad script. You could replace a bad Japanese script with a "good" English one - this has happened before - but at that point it's questionable whether any translation has happened at all, you're mostly writing new content inspired by the original work or adhering to broad constraints.

What distinction are you drawing between working with an editor vs working with a translator? Often it's a very similar process, and there are cases where something is cleaned up in a translation and then that gets incorporated back into the next edition in the original language.


Typically the translator is not working with the author and they're not involved particularly early in the writing/publication process. They often come to the work months or years later. There are certainly exceptions, though.


One very amusing example of that being http://winterson.com/2005/06/episode-iii-backstroke-of-west.... (there is a fandub too, watch it if you have the time...)


This could be good for games, but I think any manga/prose that's simplistic enough to be boiled down to purely functional phrases like "wear the amulet to fight the dragon" would probably not be worth a read...


“The shorter/simpler/more obvious the sentence is, the easier it must be” isn’t actually the case with translations.

UI strings being short usually means hidden heavy context lies in visual elements, so it’ll just strengthen hilarity in mistakes like “Name: SQL Server, Province/Prefecture: Running” (because you know, equivalents to provinces in a region are called “State” in American English...).

“Province” is more or less harmless, but “(has/is/is in/to/like to) Start(ed/ing)”-type errors due to missing context can make a UI unusable. Oh, and they're un-spottable by non-speakers because they make sense when translated back to the original language.


People read entire machine translated novels. It's not too uncommon with Chinese xianxia/xuanhuan novels. We're talking thousands of pages. It's not as incomprehensible as you might think. I didn't find it enjoyable though. I can certainly see a place for automatically translated manga. It's not a sellable product though.


> I didn't find it enjoyable though

Sure, yeah - I think we're on the same page. My post was a reference to enjoyability more than functionality.

I have definitely read novels that were long, yet rote and simplistic even in their native language. =)

But they were not works worth reading in any language IMO. They could probably be satisfactorily machine translated (with some human editing) but the result would not be enjoyable except for ultra diehards of the genre who are simply happy to be reading a work from that particular genre, quality of prose be damned.

Those unenjoyable xianxia/xuanhuan novels you mention were either rote and boring in the first place, or they were wonderfully written and had the life crushed out of them by a machine translation that dispensed with all nuance.


The whole world, except English-speaking countries, has/had exactly this experience.


I think the sweet spot is machine-assisted translation.

If you can get the output to be 90-95% correct, you can then display the raw and the machine output side-by-side, and have a human make corrections inline. Instead of a team of four working around the clock for a day or two, maybe you could have a translation as fast as it takes to proofread it three or four times end-to-end.

Rev is the same idea in the speech transcription space -- they have humans listen to the audio and fix up a machine-generated transcription.
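
A minimal sketch of the side-by-side correction workflow described above, in Python. The `Segment` class, its field names, and the sample strings are all hypothetical and not taken from any real tool. The idea is only that each line keeps the raw text, the machine output, and an optional human fix, and that the share of corrected lines gives a rough measure of how close the machine output really was to "90-95% correct".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One line of dialogue: raw text, machine output, optional human fix."""
    source: str                       # raw text from the original page
    machine: str                      # machine translation output
    correction: Optional[str] = None  # human correction, if one was made

    def final(self) -> str:
        # The human correction wins whenever it exists.
        return self.correction if self.correction is not None else self.machine


def side_by_side(segments: List[Segment]) -> List[str]:
    """Render 'raw | translation' rows; '*' marks rows a human corrected."""
    rows = []
    for seg in segments:
        flag = "*" if seg.correction is not None else " "
        rows.append(f"{flag} {seg.source} | {seg.final()}")
    return rows


def correction_rate(segments: List[Segment]) -> float:
    """Fraction of segments that needed a human correction."""
    if not segments:
        return 0.0
    corrected = sum(1 for seg in segments if seg.correction is not None)
    return corrected / len(segments)


if __name__ == "__main__":
    # Made-up sample data (romanized placeholders, not real MT output).
    page = [
        Segment("keikaku doori", "Just as planned"),
        Segment("ohayou", "Good morning early", correction="Good morning"),
    ]
    for row in side_by_side(page):
        print(row)
    print(f"corrected: {correction_rate(page):.0%}")
```

Whether this saves time depends on how often the corrector ends up rewriting the whole line rather than touching a word or two, which the sketch does not measure.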


Machine-assisted translation is deceptively bad, especially for fiction content like manga. The machine TL might mix up the order of a sentence like 'bob verbed alice', fundamentally changing the meaning, alter context, or omit implied information. All of these errors will produce English sentences that appear sensical and valid and will get past most editors and quality checkers unless they're familiar with the source language or very familiar with the work. This sort of error even creeps into official human translations of anime/manga in some cases when the translators are working quickly without a good editor (i.e. simulcasts where they have tight deadlines and low budgets).

In practice, if you look at the fan translation community for manga, machine-assisted translation is not given much more respect than machine translation: they both produce bad results, and in many cases the people who normally welcome even a clumsy translation will reject machine TL and attempt to have it removed, because it often causes people to fundamentally misunderstand the work. The worst cases of machine-assisted TL in manga become infamous to the point of becoming shared memes - try googling "abaj" or "duwang" sometime.

For a very simple pervasive example: Japanese frequently uses gender-neutral pronouns and when translating to English you'll need to appropriately select the right gendered pronoun (or proper name) for each one, if you can. This is something a human can do pretty accurately if they have enough context and knowledge of the material, but it is nearly impossible for a computer to do it accurately without a ton of assistance. In a novel this would be an easier problem because all the necessary context is in the text instead of in the art and panel layouts. You'll note this arxiv paper intentionally cheats on the gender problem.


I've seen some people run visual novels through automated translations in aggregate like this: https://static.wikia.nocookie.net/muvluv/images/8/89/Be_a_go...

You can definitely get some of the gist in there, but some of the automated translations are just way off. And pretty much none of them result in good prose. In addition, none of the translations get the names fully correct, so you definitely need someone to go and fix it.


I think you underestimate the amount of work that's required when "proofreading": you need another translator who could have done the work themselves. And it's not like you can just exchange a few words here and there and consider it done. If a human has to make corrections they'll likely rewrite the entire sentence, and then the advantage of machine-assisted translation is pretty much gone. Overall you'll save time, but the amount of time saved is probably less than what we think.


I think I’ve come across a PSA about this scheme. The story was that unaware translators offer a discount on proofreading, so some clients send in a dummy machine translation to qualify as a “proofreading” task and get a full translation at proofreading pricing.


You can’t get output to be “90-95% correct” from machine translation.

People think of language translation as some sort of same-dimension transformation, but it’s more like a re-projection that involves rotation in a higher dimension. Simple warping goes only so far, neural networks give some uncanny slurries, human artists add a lot of their own brush strokes, and it’s a lossy process both ways.

Speech transcription is much more straightforward because speakers are supposed to have a single corresponding literal expression for each segment of speech.


I am not sure if you have looked at the paper once, but they talk about context aware translation only.


I've read a few Light Novels (no text bubbles, little to no pictures) that were machine translated, and they've all been horrible to read through - you could barely make sense of the sentences, and the whole reading experience became frustrating rather than pleasurable.


I think the potential of this tool, like many industry tools, is to streamline the translation process, not completely automate it. I can see this speeding up the process and lightening the load on stuff like typesetters, even if the final translation isn't perfect.


> which isn’t even that big a deal because the communities already produce unlicensed translations of just about everything

I think those community translators will be very happy to have some of their work automated.


In translation communities the idea of using automated translation even in part in the process of translating manga is frowned upon (good luck if you try to get away with doing this on a paid gig), and even the people pirating manga on websites tend to actively dislike automated translation as well. Heroic efforts from an excellent machine translation stack and a team of quality checkers and editors can help clean it up, but at the end of the day you'd probably save time and money by just hiring someone fluent in both languages.

It's always interesting to see research perspectives on this. I think there's considerable room for improvement, especially given that current translation tech doesn't take into account things like panel layout, horizontal/vertical layout of text, different typeface use, hand-written vs. typeset text, and mixed katakana/hiragana/kanji usage - all things that an author may use to convey tone, implicitly identify the speaker, or add subtext. There are also things with no equivalent in English literature/comics whatsoever, like using furigana to attach a second reading (or dual meaning) to a word.

I can imagine some of this info eventually being pulled in by built-for-purpose MTL tools if a research group puts in the time and energy to do it - the paper appears to be a couple brief steps in that direction because they're feeding in basic information about panel layout and the genders of the people in the panels, but little else. Excited to see what happens in this space even if the idea of even more people trying to translate dense comic prose with Google Translate fills me with dread.

EDIT: I should have done more research before posting this... digging into the authors of this paper, two of them are working at a company that is trying to sell this current technology to authors right now despite the fact that it is not adequate for the challenges of translating comics to other languages. How depressing.


I added OCR into my Japanese dictionary app Nihongo recently, with manga as one of the key use cases I was thinking about. I came up with an interface I don't feel like I've seen elsewhere, where you just tap on words to look them up, so it doesn't get in the way when you're trying to read.

I'd be super curious if you think this could be a useful tool for translators. Here's a demo video if you want to see what I'm talking about:

https://www.youtube.com/watch?v=ffRxPyc9K8A


Can you use a list of files in a folder, or zip file instead of using the camera? Plenty of artists sell their manga in patreon-style websites and I would like to read them.


> It's always interesting to see research perspectives on this. I think there's considerable room for improvement, especially given that current translation tech doesn't take into account things like panel layout, horizontal/vertical layout of text, different typeface use, hand-written vs. typeset text, and mixed katakana/hiragana/kanji usage - all things that an author may use to convey tone, implicitly identify the speaker, or add subtext. There are also things with no equivalent in English literature/comics whatsoever, like using furigana to attach a second reading (or dual meaning) to a word.

I also can't imagine how it would deal with the idiosyncratic, occasionally nonsensical English that some manga-ka like to sneak into their work:

http://2.bp.blogspot.com/-Leelb4eEGz0/UzALTIDcgvI/AAAAAAAAEX...


> In translation communities the idea of using automated translation even in part in the process of translating manga is frowned upon (good luck if you try to get away with doing this on a paid gig)

Plenty of paid translation jobs nowadays explicitly hire people to clean up machine translation rather than translate from scratch.


That's doing the job you were paid to do. Being hired to translate something properly from scratch and turning in edited Google Translate output is dishonest at a minimum.


> That's doing the job you were paid to do.

Sure. My point is that it's far from being a taboo practice in the professional translation world. It's a normal practice that is part of a lot of professional workflows.


I've been a volunteer manga translator in the past (I also started getting paid for it after a while, when our manga translation group started getting funded), and a large part of it is using Google Translate or other translation tools to help out with definitions or to look for inspiration in translating the words in the manga. So this could likely help out in that regard by removing the need for tools like Google Translate.


Everyone I've talked to (including myself) who has to translate between Japanese and English follows the same process. Let the machine give you a baseline, then clean it up where it makes mistakes/has a weird tone. Though DeepL does a pretty good job at matching tone/level-of-casual-words if you phrase your input English correctly.

Human + machine assisted tooling is probably going to be the clear winner for a while when it comes to doing quality translations. As far as fully automated goes, I think it depends on language, but for Japanese it's like 80% there.


Yes, AI translation tools could produce starter text that a human could fix later. I assume the text is placed in speech bubbles manually? Perhaps in something like Photoshop? In that case, AI can also place editable text boxes at the speech bubbles, saving more time. I'd love technical details for how you translated manga!


I consider myself able to read manga in Japanese, yet certainly encounter a few words each chapter which I have to guess the meaning of. So I can see how this would help. Yet my impression would be that this isn’t really that time consuming? I would think typesetting would be the most labor intensive by an order of magnitude.


Out of the steps involved in a typical translate+typeset of a manga page (I've been personally involved in fan projects and have friends/family who do this commercially), I'd estimate the most time-consuming parts are:

2 - 4 Translation

1 - 2 Editing

1 - 4 Cleaning

1 - 2 Typesetting

1 Quality Control
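
Totalling the ranges in the list above, as a rough Python sketch. The comment does not state a unit, so the numbers are treated here as unitless relative effort per page, and the function names are made up for illustration.

```python
# Low/high estimates per step, copied from the list above.
# The unit is NOT stated in the comment; treat these as relative effort.
STEPS = {
    "Translation": (2, 4),
    "Editing": (1, 2),
    "Cleaning": (1, 4),
    "Typesetting": (1, 2),
    "Quality Control": (1, 1),
}


def total_range(steps):
    """Sum the low ends and the high ends separately."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


def midpoint_share(steps, name):
    """Share of one step when every step is taken at its midpoint."""
    midpoints = {step: (lo + hi) / 2 for step, (lo, hi) in steps.items()}
    return midpoints[name] / sum(midpoints.values())


if __name__ == "__main__":
    low, high = total_range(STEPS)
    print(f"total per page: {low} to {high}")
    for step in STEPS:
        print(f"{step}: {midpoint_share(STEPS, step):.0%}")
```

On these numbers the total runs from 6 to 13, and at midpoints translation is about 32% of the effort while typesetting is about 16%, so by this estimate typesetting is not the dominant cost.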

The effort involved in cleaning entirely depends on how good the source material is - you'd think professionals would get pristine high resolution pages without text on them, but you'd be wrong. A high quality cleaning job often means repainting a bunch of the art from scratch. If the cleaning is done poorly the typesetter is screwed.

Some stuff is just plain easy to translate, other works will have a single page full of things you have to look up and sentences that only make sense with previous chapters as context (or even worse, FUTURE chapters as context). In some cases translators also have to first transcribe the work because they're sent blurry jpegs or png files instead of being sent text (again, you'd think this wouldn't happen, but...)

QC and editing can often be done by the same person. One subtle gotcha is that your editor or QC (ideally both, but at least one) REALLY need to know the source language even if they're deferring to the translator - if only one of you is able to check the A->B it becomes very hard to spot subtle problems that can mess up the work.


I was thinking the same thing. This generally wouldn't replace human translators in a worthwhile way (in the way that grammar- and spell-checkers and even GPT don't replace human writers) but it might help with some of the drudgery.


I have a friend who worked as a professional translator and that's exactly what he did. He put the text through Google Translate and fixed any errors.


Forcing externalities improves cost efficiency


How were you paid? Patreon?


What a great keikaku!


To readers that do not understand the joke: in a fan translation of the anime Death Note, the translator decided to translate the line "Just as planned" as "Just according to keikaku" with a translator's note above it, saying that keikaku means plan. While back in the day, Japanese words were often included with translator's notes, this wasn't needed here and was obviously a joke. This was only done when the word carried special meaning that couldn't be accurately translated using English words. Honorifics like -san or -sama were prominent examples.

(I kinda miss the times when common Japanese words were not translated. Current translations often look weird (esp honorifics) and since the rise of Crunchyroll and subsequent demise of fansubs, there isn't much choice left for different translations)


> This was only done when the word carried special meaning that couldn't be accurately translated using English words.

Or when the translators or fans were weebs. E.g. many Death Note translators didn't translate "Shinigami", even though not only is "Death God" a perfectly valid western cultural concept, the Japanese cultural concept is actually directly derived from the western one.


> and since the rise of Crunchyroll and subsequent demise of fansubs, there isn't much choice left for different translations)

Two of the biggest names in ripping Crunchyroll have both shut down, one very unexpectedly, and I've noticed a small increase in fansubs since then. I'm kinda hoping for a resurgence now.


It's funny that some translators still insist on localizing honorifics, with stuff like using Sir or making up a nickname for a character. I wonder what the Japanese do when translating English source to Japanese: do they insert honorifics where they don’t exist in the source?


Correct me if I'm wrong, but usually honorifics are not used with non-East-Asian-sounding names.

Source https://youtu.be/5rOHpkpYMIM


As somebody who never followed the fansub scene, the rise of Japanese words in common English web-lingo (like desu or senpai) was really weird to observe.


tl: keikaku means plan


lol

All according to keikaku!


I love it when a keikaku comes together...


The best laid keikaku of mice and men...


My friends built this Chrome Extension that uses ML to help with the translation process.

https://chrome.google.com/webstore/detail/manga-translator/o...


For Firefox users, here's an alternative: https://addons.mozilla.org/en-US/firefox/addon/anity/


I gave the paper a quick read and there's an impressive amount of context awareness. However, one particular thing it seems like it wouldn't handle well would be idiosyncratic speech that is particular to a specific character.

examples:

- a character who uses clunky or outdated forms of Japanese

- a character meant to have a rural accent

- a character with a speech affectation, like adding a cutesy -nyo to the end of their words/sentences

- etc

Perhaps these could be improved if the model was trained on human translations for that specific character and weighted appropriately. With some manga running for dozens or hundreds of volumes, that would be feasible.

It also definitely wouldn't help with puns, cultural references or other subtleties!

BUT, this could still be really useful as an aid to human translators.


To that list I would add characters who speak in katakana - at least Google Translate doesn't seem to be aware that this is a thing.


I've been thinking about this problem for a little bit. I think an app idea that has potential is a sort of 'wikipedia for manga translation'. Essentially you have some automated translator do a first pass and create rough translations. This is served to customers. All the readers, then, have the ability to move around the overlaid speech bubbles and edit the text inside them. For the most part, automated translation seems to capture a lot of the high-level stuff, and the large number of community participants can fill in the details/broken translations using context, and additionally do whatever cleaning needs to be done.
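
A minimal Python sketch of the data model that idea implies: each bubble starts from a machine first pass, and readers can move it or edit its text wiki-style, with a revision history so bad edits can be rolled back. Every name here (`Bubble`, `machine_bubble`, the `"machine"` editor label) is hypothetical, and a real app would also need accounts, moderation, and conflict handling.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Bubble:
    """An overlaid speech bubble with a wiki-style revision history."""
    x: int
    y: int
    # (editor, text) pairs, oldest first; the first entry is the machine pass.
    revisions: List[Tuple[str, str]] = field(default_factory=list)

    @property
    def text(self) -> str:
        # Readers always see the most recent revision.
        return self.revisions[-1][1]

    def edit(self, editor: str, text: str) -> None:
        self.revisions.append((editor, text))

    def move(self, dx: int, dy: int) -> None:
        self.x += dx
        self.y += dy

    def revert(self) -> None:
        # Drop the latest edit, but never the machine first pass.
        if len(self.revisions) > 1:
            self.revisions.pop()


def machine_bubble(x: int, y: int, machine_text: str) -> Bubble:
    """Create a bubble whose first revision is the automated translation."""
    return Bubble(x, y, [("machine", machine_text)])


if __name__ == "__main__":
    bubble = machine_bubble(10, 20, "You must wear a desirable necklace")
    bubble.edit("reader1", "You must wear the enchanted amulet")
    bubble.move(5, -5)
    print(bubble.text, (bubble.x, bubble.y))
```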


Something similar was already done years ago, see the infamous GTO scanlation (example: https://www.reddit.com/r/manga/comment...). Decent manga (and comics in general) translation requires a "redraw" step where original text is completely removed. Some titles keep text strictly in bubbles, but most have text over the artwork to a varying degree, from simple (minutes per page) to highly non-trivial (hours). An image inpainting algorithm that removes text from manga would be really impressive and save a lot of effort.


Good automated (or even partially automated) inpainting would probably be a godsend for typesetters, since they can spend a considerable amount of time on a single page if it has lots of text or detailed illustrations. I imagine the publishers that put out multiple print volumes a month would probably pay good money for it, but it might still be too small of a market to build commercial software for...

In the worst case even skilled typesetters can spend an entire day cleaning up and typesetting a particularly nasty page if they have to repaint lines and blank out dozens or hundreds of kanji (it happens)


Worth checking out Clyde Mandelin, who used Google Translate on the Japanese version of Final Fantasy IV and documented the silly results.

https://legendsoflocalization.com/funky-fantasy-iv/


There is something almost Shakespearean about some of those sentences. I guess it's the combination of regular English words with highly unusual grammar.


I wonder when/if context-aware MTLs will ever view onigiri as jelly donuts. Humans already have an issue with using terms for things that do not exist within the manga's world, such as referring to "china" (Chinese ceramics), or Americanizing manga to the point where it makes even less sense to the reader (unless perhaps if you're American? As can be seen with the above).

Same with people who ̶t̶r̶a̶n̶s̶l̶a̶t̶e̶s̶ ̶W̶e̶s̶t̶e̶r̶n̶i̶z̶e̶s̶ Americanizes suffixes because I'M PRETTY SURE that EVERYONE calls their classmate MISS or MISTER. And the same goes obviously towards pet/nicknames. Heck, everyone addresses their older sister/brother as big sister/brother and NOTHING ELSE right?


The onigiri/jelly donut thing is not particularly uncommon if you try to shove complex text or dialogue through Google Translate and similar MTL systems, because they are trained heavily on existing text. If the Japanese string you plug in is uncommon enough you may get back imageboard slang, text from video game wikis, or racial slurs.


I'm pretty sure the onigiri/jelly donut thing only happened because 4Kids didn't think American children would know what onigiri was - they were infamous for their terrible localization.

Google Translate translates it more accurately to "rice balls" today - something as simple as that wouldn't be a problem for machine translation.


That's assuming the training set is all of the media that mainstream Google Translate was trained on. The paper talks about more specialized training sets so it's quite possible something like onigiri could end up under-represented (though I suspect it's common enough in manga that it would be something more obscure). Naturally the "jelly donut" part would only slip in if they used untrustworthy data like forum posts or 4kids localizations, but that's exactly the kind of data that Google Translate and similar algos scrape from the internet for training.


That means an overreliance on the word bruh if we go by current standards. Hopefully it does not come to that, but if current AI trends are anything to go by...


A free trial version would be nice. BTW, the nitpicker in me wants to say that sound effects were not covered, only text in speech balloons was caught, as can be seen in their very first picture.

If it can do visual novels too, I'm in.


I think people are going to nitpick no matter how sound effects are handled. They are almost as much art as text. It is clearly out of scope for this paper to be able to "redraw" the sound effects in English, plus, some (I'd guess most, but importantly, not all) might want to see the original anyway and have a note somewhere for its translation. I also think it's out of scope to find a suitable place to put this note - you don't want text to cover up interesting art.


Yeah, a common general-purpose solution for this is to just put hotspots over all the non-speech-bubble text that you mouse over (or tap) to reveal localized text for. It really sucks, but there's no obvious general solution since the size gap between Japanese and English text can be huge. Naturally, those hotspots aren't an option in print or PDF editions...
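
The hotspot approach boils down to a rectangle hit test. A minimal Python sketch, assuming page-pixel coordinates with the origin at the top left; the `Hotspot` class and `text_at` function are made-up names, not from any particular reader app.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hotspot:
    """A rectangle over non-bubble text (e.g. a sound effect) plus its localized text."""
    left: int
    top: int
    width: int
    height: int
    text: str

    def contains(self, x: int, y: int) -> bool:
        # Half-open bounds: the right and bottom edges are excluded.
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def text_at(hotspots: List[Hotspot], x: int, y: int) -> Optional[str]:
    """Return the localized text under a tap/hover, or None if nothing is there."""
    for spot in hotspots:
        if spot.contains(x, y):
            return spot.text
    return None


if __name__ == "__main__":
    # Hypothetical page with one sound effect in the top-left corner.
    page = [Hotspot(0, 0, 100, 50, "SFX: rumble")]
    print(text_at(page, 10, 10))
    print(text_at(page, 200, 200))
```

Because the original art stays untouched and the English only appears on demand, the size mismatch between the two scripts stops mattering, which is also why this only works on screens.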


A common solution is to add footnotes and sprinkle them around the panel gaps.


https://mantraviewer.datada.repl.co/

to see the English translation over the manga.


Another machine translator polluting the internet with garbage text.


wow - these guys got Yahoo Japan to sponsor their manga addiction, genius!

:-)



