Chilkat
Convert HTML to well-formed XML
The Chilkat Perl HTML-to-XML conversion component converts any HTML to well-formed XML. After conversion, existing XML parsers, tools, and libraries can be used to extract (scrape) information. The input/output files used in this example are provided here, along with details explaining the rules for conversion.
# file: htmlToXml.pl
# Perl script to convert HTML files to well-formed XML

use chilkat;

$htmlConv = new chilkat::CkHtmlToXml();

# Unlock the component; stop if it is still locked
$success = $htmlConv->UnlockComponent("anything for 30-day trial");
if (! $success) {
    print "component is locked!\n";
    exit;
}

# Convert each input file, reporting any conversion that fails
foreach $name ("test1", "test2", "test3") {
    $success = $htmlConv->ConvertFile("exampleData/$name.html", "output/$name.xml");
    if (! $success) {
        print "failed to convert exampleData/$name.html\n";
    }
}
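Once the HTML is well-formed XML, any XML parser can read it. The following is a minimal sketch of the scraping step, not part of the Chilkat example: it assumes the XML::LibXML module from CPAN is installed, and the inline `$xml` string is a hypothetical stand-in for a converted file such as output/test1.xml (the actual element layout produced by the converter may differ). The helper name `extract_links` is likewise made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample standing in for converter output; real output
# files may have a different root element and layout.
my $xml = <<'XML';
<root><html><body>
<p>See <a href="http://example.com/a">first</a> and <a href="http://example.com/b">second</a>.</p>
</body></html></root>
XML

# Return a list of { href, text } hashes, one per <a> element that
# carries an href attribute, found anywhere in the document.
sub extract_links {
    my ($xml_text) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_text);
    my @links;
    for my $node ($doc->findnodes('//a[@href]')) {
        push @links, {
            href => $node->getAttribute('href'),
            text => $node->textContent,
        };
    }
    return @links;
}

# Print one tab-separated line per link
for my $link (extract_links($xml)) {
    print "$link->{href}\t$link->{text}\n";
}
```

To work on a real converted file, `load_xml(location => "output/test1.xml")` could replace the `string` form. The XPath `//a[@href]` assumes the converted document does not place elements in an XML namespace.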
© 2000-2010 Chilkat Software, Inc. All Rights Reserved.