Monday, March 5, 2007

What Goes Into A Fansub (aka. an idiot's guide to fansubbing)

A fansub, as defined by Wikipedia, is
a copy of a foreign movie or television show which has been subtitled by fans in their native language. It is most commonly used to refer to fan-translated anime videos that are shared amongst other fans.
To put it slightly better, fansubbers take a show that has aired in a foreign language (and usually in a foreign country), translate the dialogue, put subtitles containing the translated dialogue together with the captured video, and [re]distribute everything to a native-language audience. In the case of anime, the video of the show is captured during airing, the dialogue is translated (primarily into English), the subtitles are put with the video, and the video-and-subtitle package is redistributed to the English-speaking audience.

The fansubbing community that exists today for anime is notably large for an online community. As I mentioned in my last blog entry, the number of people who download fansub releases is not insignificant, and neither is the number of people involved in creating those releases. Probably the best way to describe this somewhat convoluted network of people is to "classify" the community into a three-tiered hierarchy of sorts: staff, distro, leechers.

Staff: These folks are the very people who create the fansubs. While the process of turning the original product into the final product that is the fansub varies from group to group, there are core similarities among all groups in what goes into a fansub release. (Here, I will give a general overview of the process, the people involved with the fansub, and their jobs. I will address the specific details of each "job" in future posts.)

Almost always, fansubs are coordinated on Internet Relay Chat (IRC), where the various staff members communicate among themselves to get the work done. In almost all fansub groups, more than one person works on any one series. In some projects, a few people may take on more than one job, depending on staff availability and each person's skill level. However, I will go through the different jobs that are involved in a typical fansub production.

Raw Getter/Provider: When the original show is broadcast, the show is captured and encoded into a "raw" (an untranslated, original video capture). In the case of anime, the raw is captured from a television stream in Japan. The raw is usually thrown up on a file sharing network (Winny, Share), and the fansub group grabs this raw as the starting point for the production of the fansub. Depending on the group, there may be a person specifically dedicated to acquiring these raws, particularly if especially clean raws are desired by the group or by the project leader who is in charge of the logistics of the project. Other times, a random raw will be obtained by another staff member without regard for the quality of the video.

Translator (abbreviation: TL): The job that the translator does is self-explanatory: translation. The bare minimum the translator will do is dialogue translation, which is the translation of what's spoken in the show. Depending on the translator and the group, background dialogue or voices, and signs that appear on screen, may also be translated. The translator is the real driving force behind any fansub group, since only when a translation is available can the rest of the fansub happen. Translators are the most important resource in fansubbing: it is a very specialized job that requires some form of fluency (which takes time and resources to acquire), and decent, reliable translators are very difficult to find, much less recruit, since there are so few. The translator produces a script (usually an electronic .txt file), called a TL script, that contains the translated dialogue and other relevant matter.

Translation checker (abbreviation: TLC): Translation checkers are also self-explanatory: they check the original translation in the TL script to see if the translation is accurate. Obviously, familiarity with Japanese is required. A TLC is used only when needed, since translators who are exceptionally fluent in Japanese and English should be able to produce accurate TL scripts. Sometimes, the TLC (aka translation check) occurs after timing.

Once the TLC has gone through the script, the route the script takes next varies with the group and the staff working on the series. However, the same core people are involved in the handling of the script.

Editor (abbreviation: Ed): As many in the fansubbing community put it, editors take the Engrish from the TL script and mold it into English. These guys handle all rewording of dialogue as needed, including grammar and spelling corrections, dialogue pacing in text, and characterization of dialogue (eg. giving a rough edge in the subs to certain characters). Depending on the group, the editing phase can occur before or after the timing phase. An edited script is produced.

Timer (abbreviation: T, though rarely used): The timer has the task of creating a SubStation Alpha (.ssa) or Advanced SubStation Alpha (.ass) script based on the .txt dialogue script. Typically, a program like Aegisub is used to help produce the timed script. The timed script contains code that displays each subtitle at the appropriate time with respect to the audio and video, when the characters are speaking. Timing can occur before or after editing and translation checking.
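To give a rough idea of what a timed line looks like, here is a minimal Python sketch that builds one .ass "Dialogue:" event from a start time, an end time, and a line of text. The helper names (ass_timestamp, dialogue_line) and the "Default" style name are my own assumptions for illustration; a real timer would let Aegisub write these lines rather than generate them by hand.

```python
def ass_timestamp(seconds: float) -> str:
    """Format seconds as H:MM:SS.cc (centiseconds), the time format used in .ass events."""
    total_cs = round(seconds * 100)
    hours, rem = divmod(total_cs, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def dialogue_line(start: float, end: float, text: str, style: str = "Default") -> str:
    """Build one event line: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text."""
    return (
        f"Dialogue: 0,{ass_timestamp(start)},{ass_timestamp(end)},"
        f"{style},,0,0,0,,{text}"
    )


# A line spoken from 1.0s to 3.5s into the episode.
print(dialogue_line(1.0, 3.5, "Hello"))
```

Each translated line in the TL script ends up as one such event, with the start and end times matched to when the character is speaking.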

Typesetter (abbreviation: TS[er]): The typesetter applies subtitle styling (font style, font size, subtitle coloring, etc) to the dialogue subtitles, and creates and styles text via .ssa/.ass override commands for the various signs that may appear in the video, as indicated and translated by the translator. Typically, typesetting occurs after translation checking, timing, and editing, and before encoding.
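As a small illustration of what those override commands look like, here is a Python sketch that wraps a translated sign in a block of .ass override tags: \an for alignment, \pos for position, \fn for font name, \fs for font size, and \c for the primary color (which .ass stores as blue-green-red hex). The function names and the sample values are assumptions made up for this example, not anything a particular group uses.

```python
def ass_colour(r: int, g: int, b: int) -> str:
    """Convert an RGB triple to the &HBBGGRR& form used by .ass colour tags."""
    return f"&H{b:02X}{g:02X}{r:02X}&"


def typeset_sign(text: str, x: int, y: int, font: str, size: int, rgb: tuple) -> str:
    """Prefix a sign's text with an override block that centres it on (x, y)."""
    tags = f"\\an5\\pos({x},{y})\\fn{font}\\fs{size}\\c{ass_colour(*rgb)}"
    return "{" + tags + "}" + text


# A red "Library" sign centred near the top of a 640-wide frame.
print(typeset_sign("Library", 320, 40, "Arial", 28, (255, 0, 0)))
```

The resulting string goes into the text field of a timed line, so the sign shows up only while it is on screen.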

Before encoding occurs, other jobs are done to the episode or series to various degrees depending on the group.

Styling: Styling involves the font selection, font sizing, and font coloring of subtitles. Styling also occurs for various signs including episode and next episode titles, series title (if AFx isn't used), and signs that occur in various scenes in the episode. Typically, styling is lumped under the typesetting department, since it is also a one-shot job for a series. However, there have been credits for styling that weren't performed by the typesetter. Styling is one of the three primary factors (along with encoding and translation) that leechers use to decide what series to download and keep.

After Effects (AFx[er]): Some signs and text effects cannot be reproduced practically and accurately without many, many lines of .ass/.ssa in the script. Depending on the group, AFx may be used as a substitute for direct script typesetting. AFx may be used consistently for each episode; in this case, AFx falls under the typesetting department. However, there may also be AFx specialists who focus on signs that go beyond the typical simple sign and subtitle work and require particularly large amounts of FX work. AFx is also used to create "logos," basically a group videostamp that appears with the opening title sequence of the show, including whatever effects may be applied to the original opening. Logo creation is generally done only once over the span of a series. However, some groups may skip AFx completely in order to keep the subs simple and to release the fansubbed episode faster.

Karaoke: Most fansub groups include a karaoke of the opening and ending songs associated with the series. Karaoke requires separate translation, timing, styling, and application of effects, and staff separate from the typical episode work may work on the karaoke. There may be specialized karaokers who work on the karaoke, or several people (including the timer and TSer) may work on it. Depending on the group or the people working on it, AFx may be used for the karaoke to apply special effects that would be difficult to implement through .ass/.ssa.
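For the script-based (non-AFx) route, the basic building block is the .ass \k tag, which gives each syllable a duration in centiseconds so the renderer can highlight the lyric as it is sung. Below is a minimal Python sketch of turning per-syllable timings into such a line; the function name and the sample syllables and durations are made up for illustration, and real karaoke effects pile far more tags on top of this.

```python
def karaoke_text(syllables):
    """Join (syllable, duration_in_centiseconds) pairs into one \\k-tagged lyric line."""
    return "".join(f"{{\\k{dur}}}{syl}" for syl, dur in syllables)


# Three syllables lasting 0.25s, 0.20s and 0.40s.
line = karaoke_text([("ko", 25), ("ko", 20), ("ro", 40)])
print(line)
```

The tagged text then goes into a timed line whose start and end cover the whole sung phrase.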

Once all of the basics have been done to the episode, the episode goes through a quality checking (QC) phase and an encoding phase. The method and the order of encoding and QC vary from group to group, but here's the rundown on both jobs:

Encoder (abbreviation: En[c], rarely used): The encoder is the guy who puts everything together into a final release after all the other work is done. The encoder also touches up raws as needed before the final encode to make the video look cleaner and nicer, while dealing with various raw quirks like 120fps raws, h.264 in .avi containers, and the like. The encoder also deals with encoding "artifacts" and various other problems that may arise in the video. Audio processing and reencoding also fall under encoding territory. On top of his work on the encode that becomes the group's final release, the encoder may also create a "working raw" for the staff, which serves as the template on which all other work is based: scene timing, sign positioning/coloring, etc. Depending on group/project procedures, the encoder may also make a QC encode (basically an encode of lower quality than an actual release) for the QC phase, though there are groups that do softQC, which is basically QCing the episode with the pre-QC script and a working raw. Some groups also have the encoder make high quality encodes called release candidates (RCs) before a release. These RCs are potential release versions, but if there are problems with one, the scripts are altered and the encoder encodes another RC.

Quality Checker (abbreviation: QCer): Quality checking is the phase where all errors are weeded out of a fansub before a final release. QCers check for problems across every part of the fansubbing process: translation mistakes or inaccuracies; editing mistakes like grammar and spelling; timing mistakes like mistimed or missing lines or sub bleeds; typesetting mistakes like bad fonts, colors, sign positioning and movement; and encoding problems like blockiness, ghosting, embossing, double-exposure frames, and artifacts. However, the very basic function a QCer must be able to perform to be worth his/her salt is grammar- and spell-checking. The QCer does not usually alter the script itself, but rather produces a QC log, which is a record of everything he has noticed that is, or that he thinks is, a mistake; the log is then applied to the script as the release is finalized. (The reason why will be addressed later in an article on the standards of QC.) While QCing requires familiarity with the work that goes into a fansub and with all the possible problems in a fansub release, it is also a good entry-level position in which to learn the ropes, learn what the "standards" for each job are (which will be discussed in a future entry), and eventually branch out into other positions in fansubbing. QCers may also check the episode again in RC checks before release.

Once the RC has been approved for release, the episode is considered finalized and ready for distribution. This is where and when the episode hits the second tier of the fansubbing community.

Distribution (aka Distro): Distribution of a fansub release occurs on three levels. Distro, at least as I would define it, is aimed strictly at mass, efficient distribution (uploading) of a fansub release.

XDCC bots/providers: Since the development of fansubs is centered around IRC, fansub groups also have IRC channels where staff and various other people who may not be group staff hang out/idle/lurk. IRC is also one of the first places where a fansub release will end up, and the fastest way to get it. XDCC bots are programs (like iroffer) run on computers that log on to IRC with the sole purpose of distributing fansub releases to people on IRC. Downloading is basically like a straight WWW download or a straight file copy. Typically, XDCC bots are donated by people (legit bots) strictly to serve a group's (or multiple groups') releases. Some bots are donated by college students whose .edu lines can afford very high bandwidth, or by people who own/rent servers that can distribute the files for them. The latter, though, requires a monthly fee, which is generally covered by donations from people who download from those bots. There have also been stories of hacked bots, where the bot software was installed on computers without the owners' knowledge.

The main pro of distributing and obtaining releases through XDCC is the potential for extremely high transfer speeds. On an .edu line, I have been able to get speeds as high as 3MB/sec! Another pro is that the bot is fairly easy to set up. Also, if you do downloads one at a time from XDCC bots, you don't get the hard drive fragmentation that occurs with torrents. One con of XDCC is that, unless there's an archive of releases, the bot generally only has recent releases, so it's only a temporary, initial method of distribution. Two other cons are that the XDCC bot can only upload to so many people at once and that the bandwidth is only as good as the connection to the bot. Some bots I have downloaded from can be as bad as downloading on dialup. Despite that, XDCC is my preferred way of downloading... at least when the traffic isn't too bad on a bot.

Torrent/Dedicated Torrent Seeders: Torrents use a distributed peer-to-peer (p2p) protocol that downloads the file piece by piece in no real specific order. Torrents require a tracker to keep track of everyone who is downloading the file, and a client like uTorrent to open the .torrent file and download the pieces of the file. Not only are there trackers like http://a.scarywater.net, but there are also sites like Tokyotosho and Animesuki that list the different fansub torrents. This is probably the simplest and largest mass distribution method for anime. However, multiple torrent downloads result in fragmented hard disk drives. Torrents also require uploading while downloading, which can gimp download speeds if the user is not careful. One large downside to torrents is that they are a communal download: in order to download a torrent successfully, the equivalent of one full copy of the file(s) must exist among the connected peers, or a seeder must be connected. If not, the download will never complete unless a seeder gets on. To prevent such a scenario with a fansub group's releases, dedicated torrent seeders in a group's distribution channels are given the task of keeping full copies circulating on the torrents. Torrents are especially good for small fansub groups who do not have any XDCC bots to help with archiving and distribution.
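The "one full copy must exist" condition can be sketched in a few lines of Python. This is only a toy model under my own assumptions (peers are a dictionary of made-up names mapped to the set of piece numbers each one holds); it is not how any actual torrent client or tracker is written.

```python
def missing_pieces(num_pieces, peers):
    """Return the piece numbers that no connected peer currently holds."""
    available = set().union(*peers.values())
    return sorted(set(range(num_pieces)) - available)


# Two leechers with partial copies of a 4-piece file: nobody has piece 3.
swarm = {"leecher_a": {0, 1}, "leecher_b": {1, 2}}
print(missing_pieces(4, swarm))

# Once a dedicated seeder with every piece connects, nothing is missing.
swarm["seeder"] = {0, 1, 2, 3}
print(missing_pieces(4, swarm))
```

As long as the list of missing pieces is non-empty, nobody in the swarm can finish the download, which is exactly the gap the dedicated seeders are there to close.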

Direct Download (DDL): A relatively new method of distribution has appeared in the fansubbing scene: website-based episode downloading. Like XDCC, DDL is basically a straight download of a file, in this case through websites off webservers. The pro of DDL is the speed of the download without the hassle of uploading. However, DDL speed is also limited by the connection. Generally, DDL costs a significant amount of money, since it's served off of webspace. Secondly, those who host DDLs must be aware of when a fansub becomes licensed and remove the fansub from the server, or face potential problems like copyright litigation, C&D orders, and general bile from the company hosting the site. Generally, I avoid DDL, since I have problems with web browsers locking up and the like.


Once the file has been loaded onto .torrent, XDCC bots, and DDL sites, RELEASE!!! The floodgates open and people come in to download the latest release from the group:

Leechers: These people are generally those who are not involved in the creation and distribution of subs. Generally, they are the consumers of fansubs, who give nothing back except a little bandwidth through .torrents and file servers. Otherwise known as the "masses," they are the unseen faces that download the 300k copies of Naruto for personal use and watch the fansubs in anime clubs. They also donate small amounts of money to fansub groups and to bot/webhosting providers to help with the costs of fansubbing, like group ftp dumps and webhosting, and distribution XDCC bots and DDL services. At times, they may donate money to help staff with various hardware issues (especially if the staff are college students), but such occurrences are very rare. Leechers may also help with the distribution of various anime releases (as well as music/soundtracks) through file servers (fservs). However, these are secondary distribution methods that are not run or coordinated by the group, and fservs are usually very slow, which does not fit the fast distribution that true distro is about. Usually, file servers (plugins to IRC chat clients) are used as ways to get special voice statuses in IRC channels, which can aid in obtaining priority queueing and file downloading on other file servers.

But in all seriousness and honesty, fansubbers like me are glad to be able to work on shows and to share our love for the various anime series which are released every day. We fansubbers do thank the leechers and appreciate their support during the course of our work. There are a few rotten apples who demand their weekly crack, but in general, they're obedient little boogers ;o


Now, the legal part to fansubbing.

Yes, fansubbing is illegal, because it involves taking original, copyrighted works from Japan and re-releasing them (albeit in an altered form) to the general public. What lets fansubs operate with nothing short of impunity is that the United States government has no obligation to enforce copyright in the US since the animation companies are foreign, and the Japanese companies that produce the anime must file lawsuits in the United States, which would involve hiring expensive lawyers to do work in the US. However, there is a gentleman's agreement between fansubbers and the companies that license the shows: the licensors do not pursue fansubbers with legal action as long as the fansubbers stop subbing and distributing the licensed work. Most fansub groups have historically complied with this agreement, subbing unlicensed shows only, but a growing number of groups continue to sub and distribute after licensing. At times, distribution companies like Bandai and, infamously, ADV send C&D (Cease & Desist) letters/orders from lawyers to make fansubbers stop. However, the general tolerance and symbiotic relationship between fansubbers and American distribution companies has allowed the fansubbing community to survive for so long.
