Chilkat
Extract all HTML Objects from a Web Page

Demonstrates how to download a Web page (at a URL) and extract all of its HTML objects, e.g. images, links, CSS files, JavaScript files, etc.

    require 'chilkat'

    mht = Chilkat::CkMht.new()

    success = mht.UnlockComponent("Anything for 30-day trial")
    if (success != true)
        print mht.lastErrorText() + "\n"
        exit
    end

    # Download a URL into an in-memory MHT web archive contained
    # in a string variable:
    mhtDoc = mht.getMHT("http://www.gopackaging.com/")
    if (mhtDoc == nil)
        print mht.lastErrorText() + "\n"
        exit
    end

    # Now extract the HTML and embedded objects:
    unpackDir = "/Users/chilkat/temp/"
    htmlFilename = "gopackaging.html"
    partsSubdir = "objects"

    # Extract to /Users/chilkat/temp/gopackaging.html.
    # Images and other embedded objects are placed in
    # /Users/chilkat/temp/objects. Directories are automatically
    # created if they don't already exist.
    success = mht.UnpackMHTString(mhtDoc, unpackDir, htmlFilename, partsSubdir)
    if (success != true)
        print mht.lastErrorText() + "\n"
    else
        print "Unpacked!" + "\n"
    end
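After a successful unpack, it can be useful to see which objects actually landed in the parts subdirectory. The following is a minimal sketch that uses only Ruby core classes and makes no Chilkat calls. The helper name list_unpacked_objects is hypothetical (it is not part of the Chilkat API), and the sketch assumes the unpack step wrote the embedded objects as plain files directly inside unpackDir/partsSubdir, as the comments in the example describe.

```ruby
# Hypothetical helper (not part of Chilkat): group the files found in the
# parts subdirectory by lowercase file extension.
# Assumes an earlier unpack step already wrote the files to disk.
def list_unpacked_objects(unpack_dir, parts_subdir)
  parts_dir = File.join(unpack_dir, parts_subdir)
  # Nothing to report if the directory was never created.
  return {} unless Dir.exist?(parts_dir)

  Dir.children(parts_dir)
     .select { |name| File.file?(File.join(parts_dir, name)) } # skip subdirectories
     .sort
     .group_by { |name| File.extname(name).downcase }
end

if __FILE__ == $PROGRAM_NAME
  # Same directory names as the example above.
  objects = list_unpacked_objects("/Users/chilkat/temp/", "objects")
  objects.each do |ext, names|
    label = ext.empty? ? "(no extension)" : ext
    puts "#{label}: #{names.length} file(s)"
    names.each { |name| puts "  #{name}" }
  end
end
```

Grouping by extension gives a quick view of how many images, stylesheets, and scripts were extracted; files without an extension are collected under an empty-string key.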
© 2000-2010 Chilkat Software, Inc. All Rights Reserved.