Text Extractors

If nothing else, you might be able to recover your unformatted text with one of these.

A new free web service extracts the text from corrupt Word files. If neither the web service nor Damaged docx2txt (below) works, or if you have Word 97-2003 files or need the formatting recovered, try the demo of the commercial WordFix.


Name - Damaged DOCX Text Extractor

2 out of 5 - Predicted quality and likelihood of recovery. 2 out of 5 - Predicted speed of data recovery if successful.

Download URL - hosted-freeware/dd2txt-0.52.zip

Developer - Paul Pruitt

OS - Windows 2000 through Vista

Description - Word 2007 files are actually zip archives made up of many XML files. All of the text is found in one file, word/document.xml. Some basic formatting information is found in another file. The program extracts these two files with the command-line zip program CakeCMD and then runs a slightly modified Perl script called docx2txt over them to extract the text. I simply put a GUI wrapper around it.

Damaged docx2txt requires the .NET Framework version 2.0.

Comment - This program simply extracts the text from corrupt docx files that Word 2007 fails to open. The program is a bit unstable but still usable at the moment.
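
For readers who want to see the idea behind Damaged docx2txt, here is a minimal Python sketch of the same approach: open the docx as a zip archive, read word/document.xml, and keep only the text runs. This is not the program's own code (which uses CakeCMD and a Perl script); the function name docx_to_text is made up for illustration, and the sketch assumes the archive is still intact enough for Python's zipfile module to open it.

```python
import re
import zipfile
from xml.sax.saxutils import unescape


def docx_to_text(path):
    """Return the plain text stored in word/document.xml of a .docx file.

    Hypothetical helper for illustration only. 'path' may be a file name
    or an open binary file object.
    """
    with zipfile.ZipFile(path) as archive:
        raw = archive.read("word/document.xml")
    # Decode leniently so a few damaged bytes do not stop the extraction.
    xml = raw.decode("utf-8", errors="replace")

    paragraphs = []
    # Each paragraph ends with </w:p>; handle one paragraph at a time.
    for chunk in xml.split("</w:p>"):
        # Text lives in <w:t>...</w:t> runs. A regex is used instead of an
        # XML parser so that malformed markup elsewhere is simply ignored.
        runs = re.findall(r"<w:t(?:\s[^>]*)?>([^<]*)</w:t>", chunk)
        # Turn entities such as &amp; back into plain characters.
        paragraphs.append(unescape("".join(runs)))
    return "\n".join(paragraphs).strip()
```

Calling docx_to_text("report.docx") (the file name is only an example) returns the document text with one line per paragraph. If the zip structure itself is badly damaged, zipfile raises an error, and one of the raw string extractors further down the page may be the better choice.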


Name - catdoc

2 out of 5 - Predicted quality and likelihood of recovery. 2 out of 5 - Predicted speed of data recovery if successful.

Download URL - catdoc-0.94.zip

Developer - Vitus Wagner

OS - DOS. Also "Unix. Catdoc was initially developed for Linux and SPARC Solaris. It also runs on variety of other Unices. For instance it is included in FreeBSD ports collection."

File Size - 345 KB

Supported Software Versions or File Systems - All versions of Word?

Developer Provided Description - "catdoc is program which reads one or more Microsoft word files and outputs text, contained inside them to standard output. Therefore it does same work for .doc files, as UNIX cat command for plain ASCII files."

From the readme:

"Since 0.93.0 catdoc parses OLE structure and extracts Word Document stream, but doesn't parse internal structure of it.

This rough approach inevitable results in some garbage in output file, especially near the end of file and if file contains embedded OLE objects, such as pictures or equations."

Comment - If you can't open a Word file, this might let you view it. To make this work, you might have to make sure the name of your file is 8 characters or fewer. To copy the output, click the little C:\ icon in the upper left of the DOS window, choose Edit -> Mark, select the text, and then press the Enter key.


Name - DocMorph and MyMorph

3 out of 5 - Predicted quality and likelihood of recovery. 5 out of 5 - Predicted speed of data recovery if successful.

Download URL - http://docmorph.nlm.nih.gov/docmorph/

Developer - National Library of Medicine

OS - Any system with a Web browser

File Size - NA/Online Service

Supported Software Versions or File Systems - WordPerfect documents with a DOC or a WPD extension

Developer Provided Description - "The U.S. National Library of Medicine's (NLM) document conversion tools make the exchange and use of biomedical library electronic information easier for librarians, library users, and the general public. The DocMorph Web site and MyMorph software are two free conversion tools that allow users to convert more than 50 types of files into alternative, usable formats. The DocMorph Web site allows users to convert files into PDF, TIFF, text, and synthesized speech. The downloadable MyMorph software allows users to mass migrate files to PDF only."

Comment - The idea here would be to extract text from a corrupt WordPerfect file (with a DOC or a WPD extension) or turn it into a PDF and thus recover the data.


Name - BinText

3 out of 5 - Predicted quality and likelihood of recovery. 4 out of 5 - Predicted speed of data recovery if successful.

Download URL - BinText

Developer - Foundstone Inc

OS - Windows 9X/Me/NT/2000/XP

File Size - 116 KB

Supported Software Versions or File Systems - FAT12/16/32/NTFS

Developer Provided Description - "A small, very fast and powerful text extractor that will be of particular interest to programmers. It can extract text from any kind of file and includes the ability to find plain ASCII text, Unicode (double byte ANSI) text and Resource strings, providing useful information for each item in the optional "advanced" view mode. Its comprehensive filtering helps prevent unwanted text being listed. The gathered list can be searched and saved to a separate file as either a plain text file or in informative tabular format."

Comment - This allows you to recover the text from a file if nothing else.
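
As a rough illustration of what a raw text extractor of this kind does, the following Python sketch scans a file's bytes for runs of printable ASCII and for the same characters stored as two-byte (UTF-16LE) text. It is not BinText's actual algorithm; the function name extract_strings and the default minimum length of 4 are assumptions chosen for the example.

```python
import re


def extract_strings(data, min_len=4):
    """Return sorted (offset, kind, text) tuples for text runs in raw bytes.

    Hypothetical helper for illustration only.
    """
    found = []

    # Plain ASCII: min_len or more printable bytes in a row.
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in ascii_run.finditer(data):
        found.append((match.start(), "ascii", match.group().decode("ascii")))

    # Two-byte text: each printable byte is followed by a zero byte.
    wide_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in wide_run.finditer(data):
        found.append((match.start(), "utf16", match.group().decode("utf-16-le")))

    # Sort by offset so the strings appear in file order.
    return sorted(found)
```

To try it on a damaged document, read the file in binary mode (for example with open(name, "rb")) and pass the resulting bytes to extract_strings. Raising min_len filters out more of the short garbage runs at the cost of missing short words.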


Name - TextExtract

3 out of 5 - Predicted quality and likelihood of recovery. 4 out of 5 - Predicted speed of data recovery if successful.

Download URL - Textext

Developer - Ultima Thule

OS - Windows 9X/Me/NT/2000/XP

File Size - 670 KB

Supported Software Versions or File Systems - FAT12/16/32/NTFS

Developer Provided Description - "TextExtract is a small but powerful program that scans one or more files for text strings, extracts them and saves them into a separate file. Useful if, say, you have a corrupted word processor file, or if you just want to see what text, if any, is inside a file. Configurable extraction. Lovely interface. Very straightforward to use. TextExtract is free to use from its GUI (i.e. the buttons and that) and shareware if you use it from its command-line interface."

Comment - This allows you to recover the text from a file if nothing else.


Name - ReadText (rt101.zip)

3 out of 5 - Predicted quality and likelihood of recovery. 4 out of 5 - Predicted speed of data recovery if successful.

Download URL - rt101.zip

Developer - Runar Skaret 

OS - Windows

File Size - 6.1 KB

Supported Software Versions or File Systems - All Windows File Systems

Developer Provided Description - "ReadText uses a routine to distinguish text from binary code. When given a filename, it removes whatever it finds that looks like binary code in the file. What's left behind is what the program believes to be text only. This program is meant to be used on program or data files, where you think there might be some text information hidden, like game cheats, program usage information, etc. The text in such binary files can already be read by any good text viewer, but all the binary code cluttering up can make it difficult to find what you're looking for. Using this program should, by removing most of this code, make your search a little easier."

Comment - This allows you to recover the text from a file if nothing else.
