Chilkat HOME ASP Visual Basic VB.NET C# Visual C++ C MFC Delphi FoxPro Java Perl PHP Python Ruby SQL Server VBScript
Convert HTML to well-formed XML
The Chilkat Python HTML-to-XML conversion component will convert any HTML to well-formed XML. After converting, existing XML parsers, tools, and libraries can be leveraged to extract (scrape) information. The input/output files used in this example are provided here, along details explaining rules for conversion. # file: htmlToXml.py import chilkat # Python script to convert HTML files to well-formed XML htmlConv = chilkat.CkHtmlToXml() success = htmlConv.UnlockComponent("anything for 30-day trial") if not success: print "component is locked!" sys.exit(0) htmlConv.ConvertFile("exampleData/test1.html","output/test1.xml") htmlConv.ConvertFile("exampleData/test2.html","output/test2.xml") htmlConv.ConvertFile("exampleData/test3.html","output/test3.xml") |
Need a specific example? Send a request to support@chilkatsoft.com
© 2000-2007 Chilkat Software, Inc. All Rights Reserved.