Xojo Plugin
GetBaseDomain
The GetBaseDomain method is a utility function that converts a domain into its "base domain", which is useful for grouping URLs. For example, abc.chilkatsoft.com, xyz.chilkatsoft.com, and blog.chilkatsoft.com all have the same base domain: chilkatsoft.com. Country domains (.au, .uk, .se, .cn, etc.) and government, state, and .us domains are more complicated, because the base domain can span more than two labels (for example, blogs.bbc.co.uk has the base domain bbc.co.uk). Also, domains such as blogspot, wordpress, etc. are treated specially, so that "xyz.blogspot.com" has a base domain of "xyz.blogspot.com". Note: If you find other domains that should be treated similarly to blogspot.com, send a request to support@chilkatsoft.com.
Dim spider As New Chilkat.Spider
System.DebugLog(spider.GetBaseDomain("www.chilkatsoft.com"))
System.DebugLog(spider.GetBaseDomain("blog.chilkatsoft.com"))
System.DebugLog(spider.GetBaseDomain("www.news.com.au"))
System.DebugLog(spider.GetBaseDomain("blogs.bbc.co.uk"))
System.DebugLog(spider.GetBaseDomain("xyz.blogspot.com"))
System.DebugLog(spider.GetBaseDomain("www.heaids.org.za"))
System.DebugLog(spider.GetBaseDomain("www.hec.gov.pk"))
System.DebugLog(spider.GetBaseDomain("www.e-mrs.org"))
System.DebugLog(spider.GetBaseDomain("cra.curtin.edu.au"))
// Prints:
// chilkatsoft.com
// chilkatsoft.com
// news.com.au
// bbc.co.uk
// xyz.blogspot.com
// heaids.org.za
// hec.gov.pk
// e-mrs.org
// curtin.edu.au
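
Since the method is described as useful for grouping URLs, the following is a minimal sketch of one way that grouping might look. It only assumes the GetBaseDomain call shown above plus Xojo's built-in Dictionary class. The host list and the variable names (hosts, counts, baseDomain) are made up for illustration and are not part of the Chilkat API.

```xojo
// Sketch: count how many hosts fall under each base domain.
// The host list below is illustrative sample data.
Dim spider As New Chilkat.Spider

Dim hosts() As String = Array( _
  "abc.chilkatsoft.com", _
  "xyz.chilkatsoft.com", _
  "blog.chilkatsoft.com", _
  "blogs.bbc.co.uk")

// Maps base domain (String) -> number of hosts seen (Integer).
Dim counts As New Dictionary

For Each host As String In hosts
  Dim baseDomain As String = spider.GetBaseDomain(host)
  If counts.HasKey(baseDomain) Then
    counts.Value(baseDomain) = counts.Value(baseDomain).IntegerValue + 1
  Else
    counts.Value(baseDomain) = 1
  End If
Next

// Log one line per base domain. The order of keys is not relied upon here.
For Each key As Variant In counts.Keys
  System.DebugLog(key.StringValue + ": " + Str(counts.Value(key).IntegerValue))
Next
```

Given the results listed above, the three chilkatsoft.com hosts would be expected to land in a single chilkatsoft.com group and blogs.bbc.co.uk in a bbc.co.uk group. The same pattern could store an array of URLs per base domain instead of a count.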