Java Examples

Convert HTML Files to XML

This example demonstrates how to convert HTML files to well-formed XML files. Once HTML has been converted to well-formed XML, it can be parsed with existing XML tools and libraries, which makes it easy to "scrape" HTML pages and extract information. The Chilkat HTML-to-XML converter automatically fixes most HTML errors, and its output is always well-formed XML.

The input/output files used in this example are provided here, along with details explaining the rules for conversion.


// Chilkat Java HTML-to-XML Example Program

import com.chilkatsoft.CkHtmlToXml;

public class HtmlToXml {

  // Load the Chilkat native library before any Chilkat class is used.
  static {
    try {
      System.loadLibrary("chilkat");
    } catch (UnsatisfiedLinkError e) {
      System.err.println("Native code library failed to load.\n" + e);
      System.exit(1);
    }
  }

  // Convert each .html file to a well-formed .xml file.
  public static void main(String[] argv)
  {
    CkHtmlToXml htmlConv = new CkHtmlToXml();
    if (!htmlConv.UnlockComponent("anything for 30-day trial")) {
      System.err.println(htmlConv.lastErrorText());
      return;
    }

    String[] names = { "test1", "test2", "test3" };
    for (String name : names) {
      // ConvertFile returns false if the conversion fails.
      boolean success = htmlConv.ConvertFile("exampleData/" + name + ".html",
                                             "output/" + name + ".xml");
      if (!success) {
        System.err.println(htmlConv.lastErrorText());
      }
    }
  }
}
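
Once a page has been converted, the "scraping" step mentioned above can be done with the XML support built into the JDK. The following is a minimal sketch, not part of the Chilkat example: the class name XmlLinkScraper, the method extractLinks, and the inline sample document are all hypothetical, and the element layout of a real converted file may differ from the sample used here. It assumes only that the input is well-formed XML containing <a> elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects link targets from well-formed XML.
class XmlLinkScraper {

  // Returns the href value of every <a> element that has an href attribute.
  static List<String> extractLinks(String xml) throws Exception {
    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = builder.parse(new InputSource(new StringReader(xml)));

    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList anchors =
        (NodeList) xpath.evaluate("//a[@href]", doc, XPathConstants.NODESET);

    List<String> links = new ArrayList<String>();
    for (int i = 0; i < anchors.getLength(); i++) {
      links.add(((Element) anchors.item(i)).getAttribute("href"));
    }
    return links;
  }

  public static void main(String[] args) throws Exception {
    // Assumed sample input standing in for a converted file.
    String xml = "<html><body><p>See <a href=\"page1.html\">one</a> and "
               + "<a href=\"page2.html\">two</a>.</p></body></html>";
    for (String link : extractLinks(xml)) {
      System.out.println(link);
    }
  }
}
```

To run this against a file written by the converter, the file contents could be read into a string first, or the DocumentBuilder could parse the file directly with builder.parse(new java.io.File(path)).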

Need a specific example? Send a request to support@chilkatsoft.com

© 2000-2008 Chilkat Software, Inc. All Rights Reserved.