Convert HTML Files to XML
This example demonstrates how to convert HTML files to well-formed XML files. The purpose of
the conversion is to produce well-formed XML that can be parsed with existing XML tools and libraries,
which makes it easy to "scrape" HTML pages and extract information. The Chilkat HTML-to-XML converter automatically fixes most HTML
errors. In any case, the output is always well-formed XML.
The input/output files used in this example are provided here, along with details explaining the conversion rules.
// Chilkat Java HTML-to-XML Example Program

import com.chilkatsoft.CkHtmlToXml;
import com.chilkatsoft.CkByteData;
import com.chilkatsoft.CkString;

public class HtmlToXml {

  static {
    try {
      System.loadLibrary("chilkat");
    } catch (UnsatisfiedLinkError e) {
      System.err.println("Native code library failed to load.\n" + e);
      System.exit(1);
    }
  }

  // Convert a .html file to a well-formed .xml
  public static void main(String argv[])
  {
    CkHtmlToXml htmlConv = new CkHtmlToXml();
    htmlConv.UnlockComponent("anything for 30-day trial");

    htmlConv.ConvertFile("exampleData/test1.html","output/test1.xml");
    htmlConv.ConvertFile("exampleData/test2.html","output/test2.xml");
    htmlConv.ConvertFile("exampleData/test3.html","output/test3.xml");
  }
}
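Once a page has been converted, the resulting XML can be read with any standard XML parser. The following is a minimal sketch of the "scraping" step using only the JDK's built-in DOM parser, not Chilkat. The class name XmlLinkScraper and the method extractLinks are hypothetical names chosen for illustration, and the sketch assumes that the converted XML keeps anchors as lowercase "a" elements with an "href" attribute; check the actual output files to confirm the element names before relying on this.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Chilkat): collects link targets
// from well-formed XML that was produced from an HTML page.
class XmlLinkScraper {

  // Returns the href value of every <a> element that has one.
  // Assumption: anchors appear as lowercase "a" elements in the XML.
  static List<String> extractLinks(String xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(new InputSource(new StringReader(xml)));

    NodeList anchors = doc.getElementsByTagName("a");
    List<String> links = new ArrayList<>();
    for (int i = 0; i < anchors.getLength(); i++) {
      Element anchor = (Element) anchors.item(i);
      if (anchor.hasAttribute("href")) {
        links.add(anchor.getAttribute("href"));
      }
    }
    return links;
  }

  public static void main(String[] args) throws Exception {
    // Small in-memory sample standing in for a converted file.
    String sample =
        "<root><html><body>"
        + "<a href=\"http://www.example.com/\">Example</a>"
        + "<a name=\"top\">No link here</a>"
        + "</body></html></root>";
    for (String link : extractLinks(sample)) {
      System.out.println(link);
    }
  }
}
```

To read a converted file such as output/test1.xml instead of an in-memory string, DocumentBuilder.parse also accepts a java.io.File.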
© 2000-2010 Chilkat Software, Inc. All Rights Reserved.