Saturday, September 1, 2012

The Waiting is the Hardest Part

To be honest, I haven't made much progress on this undub in the past month, as I've been busy with other things. This is a situation that will soon be corrected. But for now, I thought I'd at least post another video showing off some stuff that has gotten done.



Note the FMV subtitles, which thankfully were a lot easier to get in the game than I expected them to be. Grandia II barely even has any dialog in the FMV sequences to begin with, so I guess I could have ignored those scenes altogether. But as I hope you've realized by now, this is not going to be the kind of half-assed copy and paste audio file undub that you're probably used to.

Friday, June 22, 2012

Unfinished Business

My release of a Gurumin undub earlier this year was a nice little project to get my feet wet in the world of undubbing. But truth be told, there are games out there that are in much more dire need of a proper undub. One of those games is Grandia II.

Grandia II is a game I've wanted to play with the original Japanese voices since I first put it in my Dreamcast more than ten years ago. However, actually getting it done was another matter altogether. For one thing, there was no readily available way to get ahold of the original Japanese voice files when I first started looking into the project. I solved that problem by buying a copy of the Japanese version of the game and a Dreamcast Broadband Adapter so I could dump the files myself. The next roadblock was the lack of a setup to efficiently patch the game and test changes. My Ruby-ISO-9660 library and NullDC together gave me that setup. And as a bonus, they will also allow me to make my final patch work with a GDI image of the game. The last roadblock was getting the game's script patched to use the timings from the Japanese version. Failing that, the text wouldn't sync up with the script in the scenes with voice acting, and in many cases the voice samples wouldn't get a chance to play completely. This problem was solved for me by a hacker who wrote tools to unpack, decompress, dump text from, insert text into, recompress, and repack Grandia II's map files, a task that I'm sure was no easy feat. For what it's worth, I did manage to solve an issue he never figured out, which allowed me to fix the crashes that occurred when you increased the size of the text you were inserting. ;)

With all roadblocks removed, I was finally able to produce this:

Now because I'm actually taking the time to retranslate the text from the voice acted scenes with my limited abilities in Japanese, it's going to take a while to get a finished product. How long is hard to say, but I'm going to estimate I'll have the final patch done by some time between the end of the summer and the beginning of next year. If I rushed I might be able to get it done sooner, but I have a lot of other stuff on my plate, and there's only so much time I can devote to something like this. Anyway, if you've always wanted to see Grandia II undubbed as badly as I have, this is definitely something to look forward to.

Sunday, May 6, 2012

Maybe You Can Stop Internet Piracy After All

Before I get started, I'd just like to say that I am in no way endorsing this plan, or in any way suggesting it is a good idea. The only reason I'm even bringing it up is that it seems like such an obvious solution to limit the impact of piracy on the internet, that I'm really not certain why no attempts have been made to jam it down our throats yet. Maybe we can thank the fundamental lack of understanding politicians and lobbyists have about how the internet works? Either way, I thought I'd just bring up the idea in case anyone can think of a reason why it wouldn't work that I can't.

The idea is simply this: Why not require a license to operate an internet server? Piracy on the internet these days is primarily accomplished in one of two ways. Either by uploading pirated content onto a file sharing website, or by transferring it directly via P2P. If you were to require anyone who operated a server on the internet to be licensed by the countries in which the server was connectible, the first would quickly become impractical, and the second would be downright impossible.

In the case of file sharing websites, any site found to be operating a service dealing in large amounts of pirated content would have their license revoked, or at least suspended until they cleaned up their act. It would be pretty difficult to operate under the radar, because you'd need initial approval of your site to get a license in the first place. And even if you actually did manage to get a website up and running that dealt in pirated content, odds are someone would report you pretty quickly. Even a site that was inadvertently enabling sharing of copyrighted content could have their license suspended until they fixed the loophole that allowed it. As a bonus side effect, it would make the issue of operating an international website much less of a legal hassle, because you could choose which countries you wanted your site accessible in by only applying for licenses in those countries. There could also be some sort of global license for website operators who still wanted to enable worldwide access by default.

In the case of P2P sharing, you would need a license for other people to connect to you. These would typically not be given out to people on a consumer class connection, so you would not be able to engage in P2P connections at all. The only computers your ISP would allow you to connect to would be licensed servers.

On a technical level, such a system would actually be relatively easy and cheap to implement. It wouldn't require an overhaul or redesign of any existing systems. All you would need to do would be to implement a national white-list of IP addresses. ISPs could then add filtering rules that would make it impossible to connect directly to any server that wasn't on the list. Frankly, considering how much effort China puts into censorship, I'm shocked they haven't thought of this already. White-lists are not exactly a new concept, and they are extremely effective in limiting access.
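
To make the filtering idea concrete, here is a minimal sketch of a white-list check using Ruby's standard `IPAddr` class. The list contents and the `licensed?` helper are made up for illustration (the addresses are reserved documentation ranges), and a real ISP would do this in router rules rather than in a script.

```ruby
require "ipaddr"

# Hypothetical white-list of licensed servers (documentation-only addresses).
LICENSED_RANGES = [
  IPAddr.new("192.0.2.0/24"),
  IPAddr.new("198.51.100.17"),
].freeze

# Returns true when the destination falls inside any licensed range.
def licensed?(destination)
  address = IPAddr.new(destination)
  LICENSED_RANGES.any? { |range| range.include?(address) }
end

puts licensed?("192.0.2.44")   # inside the licensed /24
puts licensed?("203.0.113.9")  # not on the list, so the connection would be dropped
```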

Such a system would, of course, not be perfect. It would, for example, still allow people who did own servers to use them to connect to darknets along with other server operators and share pirated content that way. It would probably be possible to detect and track this kind of activity with enough effort though. But, for all intents and purposes, it would put internet piracy out of the reach of the average person. As far as the copyright lobbyists are concerned, that should be good enough.

The biggest hurdle to actually implementing such a system would likely be political. You thought SOPA got people upset? An internet server license bill would almost certainly lead to the biggest uprising of internet activism the world has ever seen. Still, when you live in a country with a two party system, and both parties support the bill, the political danger either of the parties would find themselves in after passing it would be limited. I could envision a circumstance where the bill was muscled through at the beginning of an election cycle, and was put into effect long enough before the next election that most people would realize the law didn't affect them in any way. Politically, it could work if the timing was right.

I don't want to see a law like this passed. We have a top-heavy system controlled by a select group of individuals to begin with. The internet seemed to finally be putting a dent in that system, and requiring all internet server operators to be licensed by the government would almost certainly put an end to that. But I wonder, would we be able to stop a bill like this? After all the protests were over and done with, and the law was signed, would the people ever have enough political power to overcome those who have a vested interest in limiting our power again? Things like this keep me up at night.

Sunday, January 8, 2012

Going Open Source

Actually, I already included the source to my Ruby ISO 9660 parser in the file I attached to my previous post. I just thought I'd make the project official by creating a Git repository for it. You can check it out at github.com/An0Hit0/Ruby-ISO-9660. I'm not going to be doing further updates to the library in the immediate future, but it does already do several useful things, and it wouldn't be difficult to add more functionality if someone wanted to take on the project. Either way, I'm eager to see if anyone gets any use out of it besides me.

Tuesday, January 3, 2012

Oh look, I did something constructive.

It happens every once in a while, despite my best efforts. For the past week I spent some time on and off trying to get an old PSP game called Gurumin: A Monstrous Adventure undubbed. For those of you not familiar with undubs, an undub is a hack designed to take a game that has been raped by horrible English voice acting and restore it to its former Japanese glory. Maybe people make undubs for games that originate in other languages too, but I've never heard of one. Gurumin may not be the best example of a game to undub, but hopefully by reading this you might learn a little bit about how the process works, and hacking in general. I've tried my best to write this article so someone who doesn't have much experience with hacking can still understand it.

Gurumin is sort of an interesting case as far as undubbing is concerned. For one thing, it already has a feature to enable the Japanese voice acting. You may ask yourself, why would I try to undub a game that already has a Japanese voice feature? A couple of reasons actually. For one, you can't enable the feature without entering a special code, and the code can't be used unless you've already beaten the game. This is sort of a nitpicky issue I guess. You could just download a game save that's been completed and copy it to your PSP. Doing so is neither difficult nor complicated. It's just that I find something distasteful about the way they made Japanese voice acting such a hidden feature that only someone who had done some serious research would be able to figure out how to use it on their first playthrough. And besides, there's still the much more significant, although still somewhat nitpicky problem that even if you enable the Japanese voice acting, the voices that play when you are actually on the game field are still from the English voice set. This is just plain annoying. It's one thing to be forced to play the game in English only, it's entirely another to have the voice actors switching back and forth constantly. It doesn't make for a very coherent gaming experience. My only question was, could I fix it?

I will freely admit, I am not the greatest hacker in the world. I have some experience, but there is plenty of stuff that is over my head. Undubbing a game that is already more than half undubbed however, did not seem like one of those things. In many cases a successful undub can be accomplished just by moving some files from the Japanese version of the game into the English version, and rebuilding the ISO image. In this case the files I needed were already on the disc, which meant I could make an undub patch without even violating any copyright laws. It seemed like a pretty straightforward proposition to me.

The first thing I tried was to use a well-known tool for PSP ISO manipulation called UMDGen to simply overwrite the files in the English voice acting folder with the files from the Japanese voice acting folder. The game couldn't very well use the English voice acting files if they were no longer on the disc, and I knew the game engine could support the Japanese files since it already had an option to use them. What could possibly go wrong? After I rebuilt the image, the first sign of trouble was that the rebuilt image was about 100MB smaller than the original image. The second sign was that when I tried to run the game on my PSP, the game decided to hang at the loading screen about a second after it was started. Well shit, this wasn't going to be as easy as I thought.

Based on what I already knew about games that behave this way when they have their disk image rebuilt, it was highly likely the problem was that the game was loading its data from locations that were hard coded into its programming, instead of looking up the locations of files from the ISO 9660 header at the beginning of the disk image. When I rebuilt the image after overwriting files with files of different sizes, it inevitably forced UMDGen to move the data in the image layout around. The result of this was, the game was now looking for important files in locations where they no longer existed. Man, what a pain in the butt. Now I was going to have to do some actual work to make this thing happen.
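
To illustrate why a rebuild breaks hard coded locations, here is a rough sketch that lays files out back to back on 2048-byte sectors, the way an image builder might. The file names, sizes, and starting sector are invented for the example, and this is not how UMDGen actually works internally.

```ruby
SECTOR = 2048  # ISO 9660 logical sector size in bytes

# Lay files out one after another, each starting on a sector boundary,
# and return a hash of name => starting sector (LBA).
def layout(files, first_lba)
  lba = first_lba
  files.each_with_object({}) do |(name, size), table|
    table[name] = lba
    lba += (size + SECTOR - 1) / SECTOR  # round up to whole sectors
  end
end

# Hypothetical files: shrinking the first one moves everything after it.
original = layout([["voice_en.bin", 10_000], ["level1.bin", 5_000]], 100)
rebuilt  = layout([["voice_en.bin",  3_000], ["level1.bin", 5_000]], 100)

# A game that memorized the original sector for level1.bin now reads the wrong data.
puts original["level1.bin"]
puts rebuilt["level1.bin"]
```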

The next thing I tried, which was perhaps a bit misguided, was to write some software that would allow me to make changes to the ISO 9660 header. While I did say I was pretty sure the game wasn't using the header, I wasn't completely sure it never used it. There was a possibility it was loading some files using the information from the ISO header, and others using other methods. If it ever, at any point used the ISO header to locate a voice acting file, that header data would have to be changed regardless of anything else I had to do. Plus, I figured that it would be handy to have some software that would allow me to easily script edits to an ISO image, and indeed this did prove to be the case. All I knew was, there was no way I was manually editing the 2053 file entries it would take to make the necessary changes.

It would have been nice if some software already existed to do what I wanted to do, but sadly this was not the case. Due to the nature of the ISO 9660 specifications, making edits to ISO images that don't require rebuilding the entire ISO header is tricky. And it's something that the format was never really designed to handle gracefully, since ISO images were never originally intended to be altered after they were generated. Needless to say, when I set out to write my library, I pretty much had to start from scratch. Lucky for me, the specification summary done by the good people at OSDev was more than enough to tell me everything I needed to know.
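
For a feel of what parsing the header involves, here is a small sketch that reads the primary volume descriptor and pulls out the root directory's location. It is not code from my library. The field offsets follow the published ISO 9660 layout, the `read_primary_descriptor` helper is a name invented for this example, and the image is a tiny fake built in memory so the sketch runs without a real ISO.

```ruby
require "stringio"

SECTOR = 2048

# Read the fields of the primary volume descriptor needed to find the root directory.
def read_primary_descriptor(io)
  io.pos = 16 * SECTOR                          # volume descriptors start at sector 16
  pvd = io.read(SECTOR)
  raise "not an ISO 9660 image" unless pvd && pvd[1, 5] == "CD001"
  raise "not a primary descriptor" unless pvd.getbyte(0) == 1

  root = pvd[156, 34]                           # embedded root directory record
  {
    block_size:  pvd[128, 2].unpack("v").first, # little-endian copy of the field
    root_lba:    root[2, 4].unpack("V").first,
    root_length: root[10, 4].unpack("V").first,
  }
end

# Build a fake descriptor: numeric fields are stored little-endian, then big-endian.
pvd = "\x01CD001\x01".b.ljust(SECTOR, "\0".b)
pvd[128, 4] = [SECTOR].pack("v") + [SECTOR].pack("n")
record = "\0".b * 34
record.setbyte(0, 34)
record[2, 8]  = [20].pack("V") + [20].pack("N")
record[10, 8] = [SECTOR].pack("V") + [SECTOR].pack("N")
pvd[156, 34] = record
image = StringIO.new(("\0".b * (16 * SECTOR)) + pvd)

p read_primary_descriptor(image)
```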

I decided to write my library in Ruby because... I like Ruby. Even though it really isn't particularly suited for messing with binary files, I like the fact my code will run on almost any platform, and the fact that anyone can easily make changes to and use my code for their own purposes without having to deal with a complex or platform specific build process. And Ruby is really fun to program in too. ;)

After spending several days coding on and off (a faster programmer might have written the whole thing in a day, but I wasn't exactly working on it 24/7), I had implemented enough of the ISO 9660 specs to give what I wanted to do a shot. Here was the code I used for the first step:
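# Assumes the Ruby-ISO-9660 library has already been loaded
stream = File.open("gurumin.iso", "rb+")
iso = ISO.new(stream)

# Directories holding the Japanese and English voice files
jap = iso.root["PSP_GAME"]["USRDIR"]["vag_jp"]
usa = iso.root["PSP_GAME"]["USRDIR"]["vag"]

jap.entries.each do |name|
  jap_entry = jap.entry(name)
  usa_entry = usa.entry(name)

  # Point the English entry at the Japanese file's data
  usa_entry.extent_lba = jap_entry.extent_lba
  usa_entry.data_length = jap_entry.data_length

  # Write the modified entry back over the original in the image
  stream.pos = usa_entry.position
  usa_entry.dump(stream)
end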
What this does is simple enough to understand. It takes the location and size fields of the entries for the Japanese voice acting files, and uses them to update the entries for the English voice acting files. After running this script on my original ISO I started up the game to find out... absolutely nothing had changed.
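
One detail worth knowing if you ever patch these entries by hand: ISO 9660 stores the location and size fields twice, once little-endian and once big-endian. The sketch below shows roughly what rewriting those two fields in a raw directory record looks like. The `both_endian32` and `repoint_record` helpers are names made up for this example, not functions from my library.

```ruby
# Encode a 32-bit value the way ISO 9660 directory records store it:
# little-endian copy first, big-endian copy second (8 bytes total).
def both_endian32(value)
  [value].pack("V") + [value].pack("N")
end

# Return a copy of a raw directory record with new extent and length fields.
# Offsets 2 and 10 come from the ISO 9660 directory record layout.
def repoint_record(record, lba, length)
  patched = record.dup
  patched[2, 8]  = both_endian32(lba)
  patched[10, 8] = both_endian32(length)
  patched
end

english = ("\0" * 34).b  # stand-in for a raw 34-byte record
patched = repoint_record(english, 0x1234, 4096)
puts patched[2, 8].unpack("C*").map { |b| "%02X" % b }.join(" ")
# prints "34 12 00 00 00 00 12 34"
```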

The reason why I don't consider myself to be the best hacker in the world is not because I lack knowledge or ability. It's because I lack patience. Hacking something often means banging your head on the wall repeatedly after you run into dead ends. In this case, I could have gotten lucky, and the game could have used the ISO header to load the voice acting entries. In reality however, it did not, and after working on the problem for days and accomplishing nothing, I was about ready to give up. But when I tried to sleep that night, I just couldn't get the project out of my head. I really wanted to put this one in the win column, and I still had a lot of ideas left. Even if I wasn't 100% sure any of them would work, I wasn't ready to give up.

The problem of course becomes that unlike when I hack things on the PC, I don't have access to the debugging tools I need to properly analyze how a PSP game works. I could load up the whole thing in a disassembler and go through it line by line (and I actually did end up needing a disassembler later on), but that would be way too time consuming to justify based on what I was actually trying to accomplish. No, I needed some low hanging fruit.

I decided to load up the game's ISO in WinHex for some further analysis. In my experience, the first rule of hacking binary files is that strings are pay-dirt. The first thing I did after loading the file was to look for instances of the name of the first file in the voice acting folder ("a00_001_") in the data. The first instances I ran into were unsurprisingly entries in the ISO header. But then I found what looked like a pretty interesting table. It had 16-byte entries that followed a consistent pattern. The first 8 bytes were the name of a file or folder. The next 4 bytes appeared to be the size of the file. The final 4 bytes I couldn't quite figure out. I assume they had something to do with the location of the file, but there was nothing else in the table that would have indicated how to resolve the information in those bytes to an actual location on the disc. Since the table happened to have entries for every file in both the Japanese and English voice acting folders, as well as several other key folders in the game's directory structure, it seemed I'd found what I was looking for.
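
A table like that is easy to pick apart with Ruby's `unpack`. The following is only a sketch: the sample bytes are made up, the `Entry` struct and `parse_table` helper are names invented here, and reading the integers as little-endian is an assumption based on the PSP's CPU being little-endian.

```ruby
# One 16-byte entry: 8-byte name, 4-byte size, 4-byte field of unknown meaning.
Entry = Struct.new(:name, :size, :unknown)

# Split the raw table into 16-byte entries and decode each one.
def parse_table(data)
  (0...data.bytesize).step(16).map do |offset|
    name, size, unknown = data.byteslice(offset, 16).unpack("a8VV")
    Entry.new(name.delete("\0"), size, unknown)
  end
end

# Made-up sample data; real values would come from the game's table.
sample = ["a00_001_", 51_200, 0xDEADBEEF].pack("a8VV") +
         ["a00_002_", 8_192, 0x00000010].pack("a8VV")

parse_table(sample).each do |entry|
  puts format("%-8s size=%d unknown=0x%08X", entry.name, entry.size, entry.unknown)
end
```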

I figured out, by looking for an entry in the ISO header that matched the location of the data I was looking at, that the table was located in a file called "aaaa.lst". Why would anyone put a table of file sizes and locations in some randomly named file instead of just looking up the files in the ISO header? ...I don't know, but you'd be surprised how often you find these types of questionable design decisions in the process of hacking something. I edited the file so that the entries for the English files would match the entries for the Japanese files, and patched the file into the ISO using the following code:
entry = iso.root["PSP_GAME"]["USRDIR"].entry("aaaa.lst")
stream.pos = entry.extent_lba * iso.lba_size
# File.binread reads the edited file in binary mode and closes it afterwards
buffer = File.binread("aaaa.lst")
stream.write(buffer)
I skipped updating the file size in the entry because I didn't actually change the size of the file.

So I loaded up the ISO again on my PSP and the result was... nothing changed. The voice acting in the opening was still entirely in English, and working as if nothing had happened. Ok, seriously, what the hell? I had actually already gone through the ISO file looking for more references to a file in the voice acting directory earlier. Other than that table, there was nothing interesting. But more importantly, what does that file even do if changing it doesn't do anything? This is another one of those situations where patience pays off. I could have turned off the game at this point and never given it another look. But I decided to play through it a bit, and see if anything had changed. To my surprise, the field voices were now in Japanese.

If I had just been doing this for myself, I probably would have called this a win and put the project to bed. I had wanted to get the game to default to Japanese, but it was at least now possible to play the game properly with full Japanese voices by using a completed game save and the Japanese voice acting code. Really, that's what was most important. Except... I did want to make this a public release, and to make matters worse after playing into the game even further there was still some English voice acting left that played when your character used a heal point. I'm just too much of a perfectionist to let something like that stand.

However, I was running out of ideas. Not only had I changed everything in the game I could find that I felt would be meaningful to change, I was thoroughly confused about why the results I was getting were so inconsistent. How exactly was it that I'd managed to change every field voice, except one? And why was the cutscene voice acting so resistant to change? When I was editing aaaa.lst, I did skip over a few entries, as there were a few files in the English voice set that weren't in the Japanese set. That could explain why one of the voices wasn't changed. I decided to try the Japanese version of the game to find out if there was supposed to be a voice sample for heal points or not. As it turns out there was. And there were no files in the voice acting folder of the original Japanese game that weren't in the Japanese voice acting folder of the US release of the game. There was really only one thing left to do. I was going to have to disassemble the game's main executable and figure out what was going on.

The thing about PSP executables is, they're encrypted to prevent tampering and piracy. That would have been a show-stopper except for two things. Due to a mistake on Sony's part, early PSP games actually left the unencrypted version of the game's executable on the disc image. The unencrypted file is called "BOOT.BIN", and the encrypted version is called "EBOOT.BIN". But while the original unencrypted boot file is left on the disc, it is never actually used. The PSP only loads the "EBOOT.BIN" file in order to start the game. The next step was fairly obvious:
eboot_entry = iso.root["PSP_GAME"]["SYSDIR"].entry("EBOOT.BIN")
boot_entry = iso.root["PSP_GAME"]["SYSDIR"].entry("BOOT.BIN")

eboot_entry.extent_lba = boot_entry.extent_lba
eboot_entry.data_length = boot_entry.data_length

stream.pos = eboot_entry.position
eboot_entry.dump(stream)
The nice thing is, when the PSP OS loads files, such as the main executable it has to load in order to start a game, it actually uses the ISO header. I guess my library was pretty useful after all. And thanks to the magic of custom firmware, my PSP has absolutely no issues loading an eboot file that isn't actually encrypted.

Now all I had to do was figure out how the game was loading the voice files. Simple, right? Actually, yes, yes it was. Remember when I said strings are pay-dirt? The first thing I did after loading the file up in my disassembler was to look over the list of strings. Most things on the list seemed pretty useless, but two entries stuck out:
"vag/%s.vag"
"vag_jp/%s.vag"
Anyone familiar with the printf() function could tell you that it's highly probable those strings were at some point being used by the game to load files in the voice acting folders. Could it really be that simple? Could I just switch the string referring to the English folder with the string for the Japanese folder, and be done with the whole mess without familiarizing myself with MIPS assembly language? I could, except changing the entries is not that simple. If the string for the English folder was bigger than the one for the Japanese folder, there would be no problem. I could just overwrite one string with the other and put a terminating null at the end. But since the string I needed to replace was smaller than the string I wanted to replace it with, I had a problem. Here is what things looked like in the Hex editor:
vag/%s.vag..vag_
jp/%s.vag.......
(Note that the "."s represent null characters in this example, except the ones that follow both instances of "%s" which actually are "."s). If I had simply copied the string for the Japanese folder over the one for the English folder, this is what would have happened.
vag_jp/%s.vag.g_
jp/%s.vag.......
It would probably have worked for changing the English voice acting to Japanese. It's just too sloppy for my taste. If you ever actually did enter in the Japanese voice code, it would try to use the string "g" to load voice files. This would at best cause no voice to play, and at worst crash the game. Thankfully, there was a better way.
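
If the overlap is hard to picture, this little sketch reproduces it. The 32 bytes are the same ones shown in the hex dumps above, and `c_string` is a helper invented here to read a null-terminated string the way C code would.

```ruby
# The 32 bytes from the executable: two null-terminated strings,
# with the Japanese one starting at offset 12.
data = ("vag/%s.vag\0\0" + "vag_jp/%s.vag\0").ljust(32, "\0")

# Read a C-style string: everything from offset up to the next null byte.
def c_string(bytes, offset)
  terminator = bytes.index("\0", offset)
  bytes[offset...terminator]
end

# Naive patch: copy the longer string over the shorter one, in place.
naive = data.dup
replacement = "vag_jp/%s.vag\0"
naive[0, replacement.bytesize] = replacement

puts c_string(data, 0).inspect    # "vag/%s.vag"
puts c_string(data, 12).inspect   # "vag_jp/%s.vag"
puts c_string(naive, 0).inspect   # "vag_jp/%s.vag"
puts c_string(naive, 12).inspect  # "g"
```

The program still looks for the Japanese string at offset 12, and after the naive patch all that is left there is the tail end of the string that was copied over it.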

By the way, for anyone wondering why I would have had to write over the data that came after the string I was overwriting instead of inserting a few bytes to make room, you need to understand that executable files are, like the Gurumin ISO, very dependent on hard coded locations. If you cause everything to move that comes after the string you're inserting by adding bytes, all of a sudden the executable is looking for those things a few bytes earlier than they actually exist. For various reasons, it's really difficult to patch an executable file to fix this, so you just have to take it as a given that you can't ever insert bytes into the middle of an executable file.

Anyway, if you are using a decent disassembler, odds are it has the capability to take a string and find all the places where it's being used by the program. As it turns out, in this case both strings were being used in 8 different places. Not too difficult to deal with. To make things even easier, the instructions that used the strings were always the same:
la $a1, aVagS_vag #"vag/%s.vag"
la $a1, aVag_jpS_vag #"vag_jp/%s.vag"
What these instructions do isn't particularly important. What is important is if you can change all the instances of the first instruction to the second instruction, you can make the game always load the Japanese voice files.

If you actually look at the bytes for these instructions in a hex editor, here is what they look like:
la $a1, aVagS_vag #"vag/%s.vag" -> 8C 33 A5 24
la $a1, aVag_jpS_vag #"vag_jp/%s.vag" -> 98 33 A5 24
Hmm... those instructions are pretty similar in hexadecimal form. They're only one byte off actually. All I'd have to do would be to locate those instructions in my hex editor and make a one byte change:
stream.pos = boot_entry.extent_lba * iso.lba_size
buffer = stream.read(boot_entry.data_length)

# Offsets of the eight "la $a1, aVagS_vag" instructions within BOOT.BIN
offsets = [473956, 474240, 474600, 474932, 475348, 475700, 476116, 476840]

# setbyte works on Ruby 1.9 and later, where buffer[i] = 0x98 raises a TypeError
offsets.each { |offset| buffer.setbyte(offset, 0x98) }

stream.pos = boot_entry.extent_lba * iso.lba_size
stream.write(buffer)
Mission accomplished. I can now boot up Gurumin and hear Japanese voices no matter what I do. Totally worth it.
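
For anyone curious why the two instructions differ by exactly one byte, here is a sketch that decodes them. It assumes the four bytes shown by the disassembler are the second half of the `la` pseudo-instruction, which MIPS assemblers normally expand into a `lui` followed by an `addiu` carrying the low 16 bits of the address. The `decode_itype` helper is a name invented for this example.

```ruby
# Decode a 4-byte little-endian MIPS I-type instruction word.
def decode_itype(hex_bytes)
  word = [hex_bytes.delete(" ")].pack("H*").unpack("V").first
  {
    opcode:    word >> 26,           # 0x09 is ADDIU
    rs:        (word >> 21) & 0x1F,  # register 5 is $a1
    rt:        (word >> 16) & 0x1F,
    immediate: word & 0xFFFF,        # low 16 bits of the string's address
  }
end

english  = decode_itype("8C 33 A5 24")
japanese = decode_itype("98 33 A5 24")

puts format("opcode=0x%02X rs=%d rt=%d", english[:opcode], english[:rs], english[:rt])
puts format("immediates: 0x%04X vs 0x%04X", english[:immediate], japanese[:immediate])
puts japanese[:immediate] - english[:immediate]
```

The difference between the two immediates is 12, which is the same distance that separates the two strings in the hex dump from earlier.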

If you'd like to patch your ISO, go ahead and try my script. I linked to it at the bottom of this post. Be sure to read the Readme.txt file for instructions. And don't be afraid to leave a comment if you have any issues.

Gurumin Undub Patch 1.02