Read a .docx or .doc file in PHP?

4/24/2010 10:14:57 AM
comp.lang.php

4 Replies

Amit Prakash Pawar wrote:
> How to read clean data from .docx or .doc file?
> $file_url="";
> $doc_data = file_get_contents($file_url);
> it gives output like
> �T�a�b�l�e� �N�o�r�m�a�l����ö��4Ö� l�4Ö���aö������(�k
> ôÿÁ�(�� ������0�N�o� �L�i�s�t���������PK�����!
> �‚Š¼ú��������[Content_Types].xml
> How do I get clean data, like from a text file?

That's because it's not an ASCII file.  It's a binary format used by 
Microsoft Word.

If you're on a Windows server, you can use a COM object to open the 
file.  Otherwise, you'll have to find a PHP library which can read the 
document.  Good luck - I haven't found one yet which doesn't have problems.

Your best bet is to just have Word save the file as plain text.
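
For .docx specifically, a library is not strictly required: a .docx file is a ZIP archive, and the main body text lives in the part named word/document.xml. Below is a minimal sketch, assuming PHP's zip and dom extensions are enabled. The function names are made up for illustration. It only collects the text runs of each paragraph, so tabs, line breaks, headers, footers and footnotes are ignored, and it does nothing for legacy .doc files.

```php
<?php
// Sketch only: extract plain text from a .docx without Word installed.
// Assumes the zip and dom extensions are available. The function names
// are hypothetical, not part of any library.

const WORD_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Convert the XML of word/document.xml to text, one line per paragraph.
function docx_xml_to_text(string $xml): string
{
    $dom = new DOMDocument();
    // loadXML() rejects an empty string, so guard against it first.
    if ($xml === '' || !@$dom->loadXML($xml)) {
        return '';
    }
    $lines = [];
    foreach ($dom->getElementsByTagNameNS(WORD_NS, 'p') as $paragraph) {
        $text = '';
        // Each <w:t> element holds one run of visible text.
        foreach ($paragraph->getElementsByTagNameNS(WORD_NS, 't') as $run) {
            $text .= $run->textContent;
        }
        $lines[] = $text;
    }
    return implode("\n", $lines);
}

// Open the .docx as a ZIP archive and read its main document part.
// Returns null if the file cannot be opened or has no such part.
function read_docx_text(string $path): ?string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null;
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    return $xml === false ? null : docx_xml_to_text($xml);
}
```

Usage would be something like `echo read_docx_text('report.docx');` where report.docx is a placeholder file name. Keeping the XML step separate from the ZIP step makes it easy to try the parsing on a hand-written XML string.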

Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
4/24/2010 10:35:55 AM
Amit Prakash Pawar wrote:

> How to read clean data from .docx or .doc file?
> $file_url="";
> $doc_data = file_get_contents($file_url);
> it gives output like
> ?T?a?b?l?e? ?N?o?r?m?a?l????�??4�? l?4�???a�??????(?k
> ���?(?? ??????0?N?o? ?L?i?s?t?????????PK?????!
> ?��1�4�????????[Content_Types].xml
> How do I get clean data, like from a text file?

This would require you knowing and decoding Word's proprietary format.  
Unless you got Microsoft's internal documentation off the net or reverse 
engineered the format, that's not going to happen.  Use Word to output 
the file in plain text.  Then you'll be able to read it.
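
Before choosing an approach, it can help to check which container a given file actually uses. The following is a rough sketch with hypothetical helper names: ZIP-based files such as .docx start with the bytes PK\x03\x04, while legacy .doc files normally start with the OLE2 compound file signature. This only identifies the container; it does not decode any text, and other ZIP or OLE2 files (for example .xlsx or .xls) match the same signatures.

```php
<?php
// Sketch only: guess the container format from the leading bytes.
// Both function names are hypothetical helpers.

function detect_word_format(string $bytes): string
{
    // ZIP local file header, used by .docx and every other ZIP file.
    if (strncmp($bytes, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    // OLE2 compound file header, used by legacy .doc (and .xls, .ppt).
    if (strncmp($bytes, "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", 8) === 0) {
        return 'ole2';
    }
    return 'unknown';
}

// Read only the first 8 bytes of a file and classify them.
function detect_word_format_of_file(string $path): string
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return 'unknown';
    }
    $head = fread($handle, 8);
    fclose($handle);
    return detect_word_format($head === false ? '' : $head);
}
```

A caller could then route 'zip' files to a .docx reader and treat 'ole2' files as needing Word, COM, or a conversion step.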

DeeDee, don't press that button!  DeeDee!  NO!  Dee...
[I filter all Goggle Groups posts, so any reply may be automatically ignored]

4/24/2010 4:58:33 PM
On 24-4-2010 12:35, Jerry Stuckle wrote:
> Amit Prakash Pawar wrote:
>> How to read clean data from .docx or .doc file?
>> $file_url="";

"Sorry, we couldn't find that page"

is what that URL responds with.

4/24/2010 5:02:52 PM
Luuk wrote:
> On 24-4-2010 12:35, Jerry Stuckle wrote:
>> Amit Prakash Pawar wrote:
>>> How to read clean data from .docx or .doc file?
>>> $file_url="";
> sorry, we couldn't find that page
> is what they are responding at

That's because the OP doesn't know enough to use a placeholder domain.  He uses a 
real domain instead.

Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
4/24/2010 5:49:56 PM