The goal is to extract text from two PDF files (one in each language) so that each page of the PDF becomes a separate text file (well, two files, because there are two languages). This makes it easier to align the files when creating a translation memory (TM), because a mangled page affects only that one pair of files rather than everything after it.
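
One way to do the per-page split can be sketched as follows. This is a minimal sketch, not a definitive method: it assumes the text of each PDF has already been dumped to a single string by a tool such as pdftotext, which marks the end of each page with a form feed character. The function names, the language codes and the page_0001.en.txt naming scheme are all hypothetical choices made for illustration.

```python
from pathlib import Path


def split_pages(text: str) -> list[str]:
    """Split extracted text into pages on the form feed character."""
    pages = text.split("\f")
    # Assumption: the extractor ends every page with a form feed,
    # so the final chunk is empty and is not a real page.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages


def write_pages(text: str, lang: str, out_dir: str) -> list[Path]:
    """Write one text file per page, e.g. page_0001.en.txt (hypothetical naming)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, page in enumerate(split_pages(text), start=1):
        # Blank pages are still written, so that page N in one language
        # always lines up with page N in the other language.
        path = out / f"page_{number:04d}.{lang}.txt"
        path.write_text(page, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_pages once per language with the same output directory (for example with "en" and "fr" as the language codes) puts the two files for each page side by side when the directory is sorted by name, which is convenient for feeding them to an aligner one pair at a time.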
