Chilkat
Convert HTML files into well-formed XML
Downloads for Windows/Linux and Install Instructions
The Chilkat Ruby HTML-to-XML conversion component will convert any HTML to well-formed XML.
After converting, existing XML parsers, tools, and libraries can be leveraged to extract (scrape) information.
The input/output files used in this example are provided here, along with details explaining the conversion rules.
# file: htmlToXml.rb
require 'rubygems'
require 'chilkat'

# Ruby script to convert HTML files to well-formed XML
htmlConv = Chilkat::CkHtmlToXml.new()
success = htmlConv.UnlockComponent("anything for 30-day trial")
if not success
  print "component is locked!"
  exit
end

htmlConv.ConvertFile("exampleData/test1.html","output/test1.xml")
htmlConv.ConvertFile("exampleData/test2.html","output/test2.xml")
htmlConv.ConvertFile("exampleData/test3.html","output/test3.xml")
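Once a file has been converted, the XML can be scraped with any ordinary XML parser. The following is a minimal sketch using REXML (distributed with Ruby), not part of the Chilkat example. The inline sample document, its element layout, and the helper name extract_links are assumptions made for illustration; the actual structure of output/test1.xml depends on the input HTML and the conversion rules.

```ruby
require 'rexml/document'

# Hypothetical stand-in for a converted file such as output/test1.xml.
# The element layout is an assumption, not actual Chilkat output.
SAMPLE_XML = <<~XML
  <root>
    <html>
      <body>
        <h1>Products</h1>
        <a href="http://example.com/a">First</a>
        <a href="http://example.com/b">Second</a>
        <a name="anchor-only">No link here</a>
      </body>
    </html>
  </root>
XML

# Return [href, text] pairs for every <a> element that carries an href attribute.
def extract_links(xml_text)
  doc = REXML::Document.new(xml_text)
  REXML::XPath.match(doc, '//a[@href]').map do |a|
    [a.attributes['href'], a.text.to_s.strip]
  end
end

extract_links(SAMPLE_XML).each do |href, text|
  puts "#{text}: #{href}"
end
```

To run the same extraction on a real conversion result, the file contents could be passed in instead of the sample, for example extract_links(File.read("output/test1.xml")), adjusting the XPath expression to whatever elements the converted document actually contains.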
© 2000-2010 Chilkat Software, Inc. All Rights Reserved.