URL Functions#
URL functions in Spectrum let you work with URLs. You can extract parts of a URL such as the host, the path, or the protocol.
Functions#
The following are the URL functions in Spectrum:
URL_AUTHORITY#
Syntax#
URL_AUTHORITY(<URL string>)
Description#
Gets the authority part (user info, host, and port) of the URL provided as a string.
Examples
Column1 | URL_AUTHORITY returns |
---|---|
=URL_AUTHORITY("http://user:[email protected]:80/path/file.ext?query=string#fragment") | user:[email protected]:80 |
=URL_AUTHORITY("http://admin:[email protected]:70/path/file.ext?query=string#fragment") | admin:[email protected]:70 |
=URL_AUTHORITY("http://robert:[email protected]:60/path/file.ext?query=string#fragment") | robert:[email protected]:60 |
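
For comparison outside Spectrum, the authority is the part of a URL between `//` and the next `/`, `?`, or `#`. The following is a minimal Python sketch using the standard library's `urllib.parse.urlsplit`. The helper name `url_authority` and the `example.com` host are illustrative assumptions, and this is not Spectrum's implementation.

```python
from urllib.parse import urlsplit


def url_authority(url: str) -> str:
    """Return the authority (user info, host, and port) of a URL."""
    # urlsplit exposes the authority component as `netloc`.
    return urlsplit(url).netloc


# example.com is a placeholder host used only for illustration.
print(url_authority("http://user:password@example.com:80/path/file.ext?query=string#fragment"))
```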
URL_DEFAULT_PORT#
Syntax#
URL_DEFAULT_PORT(<URL string>)
Description#
Gets the default port number of the protocol associated with this URL.
Examples
Column1 | URL_DEFAULT_PORT returns |
---|---|
http://www.datameer.com | 80 |
ftp://ftp.datameer.com | 21 |
gopher://gopher.datameer.com | 70 |
https://www.datameer.com | 443 |
file:///file.txt | -1 |
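
The table suggests a lookup keyed on the protocol, with -1 for protocols that have no default port. A rough Python sketch of that idea follows. The function name `url_default_port` and the port table are assumptions that cover only the protocols listed above.

```python
from urllib.parse import urlsplit

# Default ports for the protocols shown in the examples (an assumed, partial table).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "gopher": 70}


def url_default_port(url: str) -> int:
    """Return the default port of the URL's protocol, or -1 if none is known."""
    scheme = urlsplit(url).scheme  # urlsplit already lowercases the scheme
    return DEFAULT_PORTS.get(scheme, -1)
```

Note that the result depends only on the protocol. A port written explicitly in the URL does not change it.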
URL_FILE#
Syntax#
URL_FILE(<URL string>)
Description#
Gets the file name of the URL provided as a string.
Examples
Column1 | URL_FILE returns |
---|---|
file:///path/index | index |
http://www.datameer.com/index | index |
http://user:[email protected]:80/resource | resource |
http://user:[email protected]:80/ | " " |
http://user:[email protected]:80 | " " |
URL_HOST#
Syntax#
URL_HOST(<URL string>)
Description#
Gets the host name of the URL provided as a string.
Examples
Column1 | URL_HOST returns |
---|---|
http://www.datameer.com/index | www.datameer.com |
http://ddate.api.cisco211.underground211.de/?format=html | ddate.api.cisco211.underground211.de |
URL_PARAM#
Syntax#
URL_PARAM(<URL string>;<string>)
Description#
Gets the value of the query parameter named by the second argument from the given URL. The result is a string.
Examples
Column1 | Column2 | URL_PARAM returns |
---|---|---|
http://portal.com/site/insidege/fullstory?id_a=2648385&id_b=9988 | id_b | 9988 |
http://portal.com/site/insidege/fullstory?id_a=2648385&id_b=9988&id_c=181818 | id_c | 181818 |
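
The lookup can be approximated in Python with `urllib.parse.parse_qs`. This is a sketch under assumptions: the helper name `url_param` is hypothetical, and returning the first value for a repeated parameter and `None` for a missing one are choices made here, not documented Spectrum behavior.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def url_param(url: str, name: str) -> Optional[str]:
    """Return the value of query parameter `name`, or None if it is absent."""
    # parse_qs maps each parameter name to a list of its values.
    values = parse_qs(urlsplit(url).query).get(name)
    # Assumption: if the parameter repeats, return the first occurrence.
    return values[0] if values else None
```

As in the table, the value comes back as a string even when it looks numeric.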
URL_PARAMS#
Syntax#
URL_PARAMS(<URL string>;<string>)
Description#
Gets all values of the query parameter named by the second argument from the given URL. The result is a list.
Examples
Column1 | Column2 | URL_PARAMS returns |
---|---|---|
http://portal/site/insidege/fullstory?id_a=2648385&id_a=9988&id_a=181818 | id_a | [2648385, 9988, 181818] |
http://portal/search?q=datameer&q=trial | q | [datameer, trial] |
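
For a parameter that repeats in the query, the same idea can be sketched in Python by returning every occurrence in order. The helper name `url_params` is hypothetical, and the empty list for a missing parameter is an assumption.

```python
from typing import List
from urllib.parse import parse_qs, urlsplit


def url_params(url: str, name: str) -> List[str]:
    """Return every value of query parameter `name`, in order of appearance."""
    # Assumption: a parameter that does not occur yields an empty list.
    return parse_qs(urlsplit(url).query).get(name, [])
```

The list elements are strings in this sketch; the Spectrum tables above do not show the element type explicitly.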
URL_PATH#
Syntax#
URL_PATH(<URL string>)
Description#
Gets the path part of the URL provided as a string, according to the Java URL API.
Examples
Column1 | URL_PATH returns |
---|---|
http://www.datameer.com/info/index | /info/index |
https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=MOD+function | /webhp |
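
As the second example shows, the path stops before the query (`?`) and the fragment (`#`). The following Python sketch illustrates the same split with `urllib.parse.urlsplit`, used here as a stand-in for the Java URL API that the description refers to. The helper name `url_path` is hypothetical.

```python
from urllib.parse import urlsplit


def url_path(url: str) -> str:
    """Return the path component, without the query string or the fragment."""
    return urlsplit(url).path
```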
URL_PLD#
Syntax#
URL_PLD(<URL string>)
Description#
Extracts the PLD (paid-level domain, as defined in the IRLbot paper) from a URL.
Examples
Column1 | URL_PLD returns |
---|---|
http://www.tools.google.de | google.de |
http://cisco211.underground211.de | underground211.de |
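
For hosts under a single-label suffix such as `.de`, the PLD is the last two labels of the host name. The Python sketch below implements only that naive rule. It is an assumption-laden illustration (the name `naive_pld` is hypothetical), not Spectrum's algorithm: a complete implementation would need a list of multi-label suffixes such as `co.uk`, which this sketch does not handle.

```python
from urllib.parse import urlsplit


def naive_pld(url: str) -> str:
    """Return the last two labels of the host name (naive PLD approximation)."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Naive rule: keep "<registered name>.<single-label suffix>".
    return ".".join(labels[-2:])
```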
URL_PORT#
Syntax#
URL_PORT(<URL string>)
Description#
Gets the port number of the URL provided as a string. Returns -1 if the URL does not specify a port.
Examples
Column1 | URL_PORT returns |
---|---|
http://www.mydomain.com:80/path/ | 80 |
http://www.myexample.com:8080/index.jspx | 8080 |
http://www.google.de/ig?hl=de | -1 |
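
The third example returns -1 because the URL carries no explicit port. A small Python sketch of that convention follows. The helper name `url_port` is hypothetical, and `urlsplit` itself reports a missing port as `None`, so the mapping to -1 is added here to mirror the table.

```python
from urllib.parse import urlsplit


def url_port(url: str) -> int:
    """Return the explicit port of a URL, or -1 if none is given."""
    port = urlsplit(url).port  # None when the URL has no explicit port
    return -1 if port is None else port
```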
URL_PROTOCOL#
Syntax#
URL_PROTOCOL(<URL string>)
Description#
Gets the protocol name of the URL provided as a string.
Examples
Column1 | URL_PROTOCOL returns |
---|---|
http:www.mydomain.com | http |
https:www.datameer.com | https |
URL_QUERY#
Syntax#
URL_QUERY(<URL string>)
Description#
Gets the query part of the URL provided as a string.
Examples
Column1 | URL_QUERY returns |
---|---|
http://ddate.cisco211.de/api.php?fg=fff&bg=0&ty=fg | fg=fff&bg=0&ty=fg |
http://cisco211.de/?root:sub1:sub2 | root:sub1:sub2 |
URL_REF#
Syntax#
URL_REF(<URL string>)
Description#
Gets the anchor (reference) of the URL provided as a string.
Examples
Column1 | URL_REF returns |
---|---|
http://google.de/#section3 | section3 |
http://www.example.com/folder/page.html#cab55 | cab55 |
http://www.datameer.com/#newfeatures | newfeatures |
URL_TLD#
Syntax#
URL_TLD(<URL string>)
Description#
Gets the top-level domain (TLD) of a URL.
Examples
Column1 | URL_TLD returns |
---|---|
http://google.de/index.py?ig&hl=jp | de |
http://www.braufest.by | by |
URL_USERINFO#
Syntax#
URL_USERINFO(<URL string>)
Description#
Gets the user info part of the URL provided as a string.
Examples
Column1 | URL_USERINFO returns |
---|---|
http://user:[email protected]:80/path/file.ext?query=string#fragment | user:password |
http://myaccess:[email protected]/index.shtml | myaccess:mysecret |
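
The user info is the part of the authority before the `@` sign. A minimal Python sketch follows. The helper name `url_userinfo`, the `example.com` host, and the empty-string result for URLs without user info are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def url_userinfo(url: str) -> str:
    """Return the 'user:password' part of the authority, or '' if absent."""
    # Split on the last "@" so the host and port are dropped.
    userinfo, at_sign, _host = urlsplit(url).netloc.rpartition("@")
    return userinfo if at_sign else ""
```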
URL_DECODE#
Syntax#
URL_DECODE(<URL string>)
Description#
Decodes an encoded string.
It is assumed that all characters in the encoded string are one of the following: "a" through "z", "A" through "Z", "0" through "9", "-", "_", ".", and "*". The character "%" is allowed but is interpreted as the start of a special escaped sequence.
The default for <character_encoding:string> is UTF-8.
Examples
Column1 | URL_DECODE returns |
---|---|
http%3A%2F%2Fgoogle.com%2F | http://google.com/ |
http%3A%2F%2Fwww.datameer.com%2Fproduct%2Findex.html | http://www.datameer.com/product/index.html |
http%3A%2F%2Fwww.datameer.com%2FDatameer-trial.html | http://www.datameer.com/Datameer-trial.html |
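
To illustrate the decoding step, here is a short Python sketch using `urllib.parse.unquote_plus`, which turns each `%xy` escape back into a byte and decodes the bytes with the given character encoding. The helper name `url_decode` is hypothetical. Treating `+` as a space is an assumption based on the encoding rules described for URL_ENCODE.

```python
from urllib.parse import unquote_plus


def url_decode(text: str, encoding: str = "utf-8") -> str:
    """Decode %xy escapes (and '+' as a space) using the given encoding."""
    return unquote_plus(text, encoding=encoding)
```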
URL_ENCODE#
Syntax#
URL_ENCODE(<URL string>)
Description#
Encodes a string. The default for <character_encoding:string> is UTF-8.
When encoding a string, the following rules apply:
- The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
- The special characters ".", "-", "*", and "_" remain the same.
- The space character " " is converted into a plus sign "+".
- All other characters are unsafe and are first converted into one or more bytes using some encoding scheme. Then each byte is represented by the 3-character string "%xy", where xy is the two-digit hexadecimal representation of the byte.
Examples
Column1 | URL_ENCODE returns |
---|---|
http://google.com/ | http%3A%2F%2Fgoogle.com%2F |
http://www.datameer.com/product/index | http%3A%2F%2Fwww.datameer.com%2Fproduct%2Findex |
http://www.datameer.com/Datameer-trial.html | http%3A%2F%2Fwww.datameer.com%2FDatameer-trial.html |
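
The encoding rules above can be approximated in Python with `urllib.parse.quote_plus`. This is a sketch, not Spectrum's implementation, and the helper name `url_encode` is hypothetical. Python's function differs from the listed rules in two places, so the sketch marks `*` as safe and escapes `~` explicitly.

```python
from urllib.parse import quote_plus


def url_encode(text: str, encoding: str = "utf-8") -> str:
    """Encode text following the rule list above (letters, digits, . - * _ kept)."""
    # quote_plus turns spaces into "+" and other unsafe bytes into %XY;
    # "*" is marked safe so that it stays unchanged.
    encoded = quote_plus(text, safe="*", encoding=encoding)
    # quote_plus also leaves "~" unescaped, so escape it by hand.
    return encoded.replace("~", "%7E")
```

With UTF-8, a non-ASCII character becomes one escape per byte, so "é" is encoded as "%C3%A9".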