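After a lot of googling, I finally found a way to convert an audio PDF to MP3. Unfortunately, this method only works on Linux. When I have time, I will consider rewriting it as a version that runs on Windows.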

The original post is here:
http://yhager.com/content/extracting-embedded-audio-pdf

Extracting embedded audio from a PDF
Submitted by yhager on Mon, 12/13/2010 - 19:30
I recently purchased a PDF with embedded sound files from Piano For All. The PDF also contains videos, but they were provided as separate files.
For the life of me, I was unable to find an application that can play those media files from the PDF. I tried Adobe Reader, which just sent me to a web page with a “no plug-ins to your OS exist” message. Then I tried Okular, my favourite PDF reader, and it didn’t work either. Same results with evince, xpdf and gv.
I also tried to convert the PDF to PS, but that just ignored the embedded audio files, so it didn’t help either.
So I turned to find a solution myself. Analysing the PDF structure, I was able to find the stream objects and their “file names” within the PDF rather easily. I quickly wrote a C program that allowed me to extract those and save them in the file system as files (attached here).
The problem with that was that the files within the PDF were compressed in some way I was unable to detect.
After reading up a bit on PDF structure and compression algorithms, and inspecting the PDF file itself (I just opened it with a text editor), I found that the streams are compressed with the “FlateDecode” filter. I then found a Ghostscript utility that decodes those compressed streams and replaces them with their uncompressed versions.
To use it, save it in the same directory as your PDF file and run:
$ gs -- pdfinflt.ps original.pdf uncompressed.pdf
Assuming your original PDF file is named original.pdf, you now have the uncompressed media files embedded within uncompressed.pdf.
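For readers curious what undoing “FlateDecode” involves, here is a minimal Python sketch. It is not the pdfinflt.ps utility and not the author's C program. It assumes every stream sits between literal "stream" and "endstream" keywords and is plain zlib data with no further filters or predictors. The name inflate_streams is made up for illustration.

```python
import re
import zlib

# Capture the raw bytes between "stream" and "endstream".
# The lookbehind stops the "stream" inside "endstream" from matching.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the inflated payload of every stream that zlib can decode."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        # decompressobj tolerates the end-of-line bytes left before "endstream".
        inflater = zlib.decompressobj()
        try:
            data = inflater.decompress(match.group(1))
        except zlib.error:
            # Not zlib data (uncompressed, or a different filter): skip it.
            continue
        if inflater.eof:  # keep only streams that decoded completely
            results.append(data)
    return results
```

A real PDF can chain several filters or encrypt its streams, so treat this only as an illustration of the idea. Ghostscript handles those cases properly.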
Now all I need is to run the stream extraction code again:
$ ./pdf_extract_embedded uncompressed.pdf
The files will be written to the current directory, maintaining the name the original PDF author gave to them when she embedded them in the PDF.
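The attached C program is not reproduced in this post, so the following Python sketch only shows the saving step in a simplified form. Unlike the author's tool, it does not recover the embedded file names. It writes numbered files and guesses the extension from well-known leading bytes. The helper names guess_extension and save_streams, and the stream_NNN naming scheme, are hypothetical.

```python
import os


def guess_extension(data):
    """Guess a file extension from the first bytes of a stream (heuristic)."""
    if data[:3] == b"ID3":
        return ".mp3"  # MP3 file that starts with an ID3v2 tag
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return ".mp3"  # MPEG audio frame sync; other MPEG audio may match too
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return ".wav"
    return ".bin"  # unknown content, keep the raw bytes anyway


def save_streams(streams, out_dir="."):
    """Write each byte string to out_dir and return the paths written."""
    paths = []
    for index, data in enumerate(streams, start=1):
        name = "stream_%03d%s" % (index, guess_extension(data))
        path = os.path.join(out_dir, name)
        with open(path, "wb") as handle:
            handle.write(data)
        paths.append(path)
    return paths
```

Given a list of byte strings recovered from the uncompressed PDF, save_streams(streams, "out") would leave files such as stream_001.mp3 in an existing directory named out.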
You can find pdfinflt.ps online in the Ghostscript distribution, but I attached it here, for convenience.
The C code for extracting the streams is also attached. It is not the best piece of software around if you want to learn how to code, but it does the job for me. You will need to compile it yourself though. Use something like:
$ gcc -o pdf_extract_embedded pdf_extract_embedded.c







  • 快乐秘诀

    2013-2-1 21:33:12

    A pity I can't make use of it, but thanks for the effort.
  • happy-yulu

    2013-2-3 22:24:03

    I can't follow this; I still have more to learn.
  • lanxue

    2013-2-3 23:46:19

    I simply can't make sense of it~~