Allowed vocabulary: [a-zA-Z][0-9]-_.!~*'();,/?:@&=+$#
Use encodeURI to encode a whole URL.
Example: encodeURI('http://www.google.ch?hi=haal l') → http://www.google.ch?hi=haal%20l
Use encodeURIComponent to encode a part of a URL, for instance when the value
of a parameter could contain a '&'.
Example: encodeURIComponent('http://www.google.ch?hi=haal l') → http%3A%2F%2Fwww.google.ch%3Fhi%3Dhaal%20l
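The difference matters as soon as a parameter value contains '&' or '='. A minimal sketch; the value 'Tom & Jerry=friends' is a made-up example, not from the original notes:

```javascript
// Hypothetical parameter value that itself contains '&' and '='.
const value = 'Tom & Jerry=friends';

// encodeURI leaves '&' and '=' untouched, so a server would read the
// value as two separate parameters.
const wrong = 'http://www.google.ch?hi=' + encodeURI(value);

// encodeURIComponent escapes '&' and '=', so the value stays in one piece.
const right = 'http://www.google.ch?hi=' + encodeURIComponent(value);

console.log(wrong); // http://www.google.ch?hi=Tom%20&%20Jerry=friends
console.log(right); // http://www.google.ch?hi=Tom%20%26%20Jerry%3Dfriends
```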
Link about encoding in javascript: http://xkr.us/articles/javascript/encode-compare/
Allowed vocabulary: [a-zA-Z][0-9]+/ (plus = as padding)
Frequently used to encode e-mail content, because only ASCII characters are allowed in e-mail (SMTP specification).
How it works:
Transform an 8-bit data stream into a 6-bit data stream. For example:
123 (as text) → 00110001 00110010 00110011 (binary ASCII) → 001100 010011 001000 110011 (binary Base64) → MTIz
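The regrouping from 8 bits to 6 bits can be sketched by hand. This is only an illustration for ASCII input; the names ALPHABET and base64Sketch are made up here, and real code should use a built-in encoder:

```javascript
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encode a string of ASCII characters (code points 0-127 assumed).
function base64Sketch(text) {
  // 1. Write every character as 8 bits.
  let bits = '';
  for (const ch of text) {
    bits += ch.charCodeAt(0).toString(2).padStart(8, '0');
  }
  // 2. Pad with zero bits so the length is a multiple of 6.
  while (bits.length % 6 !== 0) bits += '0';
  // 3. Map every 6-bit group to one character of the alphabet.
  let out = '';
  for (let i = 0; i < bits.length; i += 6) {
    out += ALPHABET[parseInt(bits.slice(i, i + 6), 2)];
  }
  // 4. Pad the output with '=' to a multiple of 4 characters.
  while (out.length % 4 !== 0) out += '=';
  return out;
}

console.log(base64Sketch('123')); // MTIz
```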
Drawbacks:
In the browser (originally only in Mozilla-based browsers, now in all modern ones):
btoa('123') = MTIz
atob('MTIz') = 123
Allowed vocabulary: [a-zA-Z][0-9]'()-,.:/?
How it works:
All characters in the allowed vocabulary are encoded the same way as in the ASCII table. For all
other characters, take the hexadecimal index in the Unicode table (16 bits) and encode it like Base64. Example:
£ → 163 (decimal) → 00A3 (hexadecimal) → 00000000 10100011 (binary) → 000000 001010 001100 (binary Base64, padded with two zero bits) → +AKM- (+ and - are delimiters).
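The same steps as a small sketch. It handles a single character that fits in one 16-bit UTF-16 unit; the names B64 and utf7Char are made up for illustration, and this is not a complete UTF-7 encoder:

```javascript
const B64 =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Encode one character (a single UTF-16 code unit) as a UTF-7 block.
function utf7Char(ch) {
  // 16 bits of the code unit, most significant bit first.
  let bits = ch.charCodeAt(0).toString(2).padStart(16, '0');
  // Pad with zero bits to a multiple of 6.
  while (bits.length % 6 !== 0) bits += '0';
  let body = '';
  for (let i = 0; i < bits.length; i += 6) {
    body += B64[parseInt(bits.slice(i, i + 6), 2)];
  }
  // '+' opens the Base64 block, '-' closes it; no '=' padding is used.
  return '+' + body + '-';
}

console.log(utf7Char('\u00A3')); // +AKM-
```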
Frequently used to encode mails in the Internet Message Access Protocol (IMAP).
Pros:
Contra:
Example:
List: http://www.w3.org/TR/html4/sgml/entities.html
Example: &#39; (decimal, renders as ') or &#x22; (hexadecimal, renders as ")
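A numeric entity is just the Unicode code point of the character, written in decimal or hexadecimal. A minimal sketch; the function names are made up for illustration:

```javascript
// Turn every character of a string into a decimal entity: &#NN;
function toDecimalEntities(text) {
  return Array.from(text, ch => '&#' + ch.codePointAt(0) + ';').join('');
}

// Turn every character of a string into a hexadecimal entity: &#xHH;
function toHexEntities(text) {
  return Array.from(text, ch => '&#x' + ch.codePointAt(0).toString(16) + ';').join('');
}

console.log(toDecimalEntities("'")); // single quote, code point 39
console.log(toHexEntities('"'));     // double quote, code point 0x22
```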
List: http://www.utf8-zeichentabelle.de/
\x22 (hexadecimal) → \42 (octal) → 0010 0010 (binary) → "
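The equivalence of the notations can be checked directly. A small sketch; the octal value is parsed with parseInt instead of written as '\42', because octal escapes in string literals are rejected in strict mode:

```javascript
// Hexadecimal escape sequence for the double quote.
const fromHex = '\x22';

// Octal 42 is decimal 34, the same character code.
const fromOctal = String.fromCharCode(parseInt('42', 8));

// The 8-bit binary form of the character code.
const bits = fromHex.charCodeAt(0).toString(2).padStart(8, '0');

console.log(fromHex === fromOctal); // true
console.log(bits);                  // 00100010
```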
Example:
<script>
Remove dangerous content from the data. This is not really secure, see the examples.
It is better to insert a placeholder in place of the removed string. Example:
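A sketch of why plain removal is weak and why a placeholder helps. The filter functions, the attack string and the '[removed]' placeholder are assumptions made for illustration, not a recommended sanitizer:

```javascript
// Naive filter: delete every occurrence of "<script>" in a single pass.
function removeTag(input) {
  return input.replace(/<script>/gi, '');
}

// Variant with a placeholder where the string was removed.
function replaceTag(input) {
  return input.replace(/<script>/gi, '[removed]');
}

// The inner "<script>" is deleted and the outer pieces join into a new tag.
const attack = '<scr<script>ipt>alert(1)</script>';

console.log(removeTag(attack));  // <script>alert(1)</script>
console.log(replaceTag(attack)); // <scr[removed]ipt>alert(1)</script>
```

With the placeholder, the fragments on both sides of the removed string can no longer merge into a working tag.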
http://www.milw0rm.com → Links to security exploits for all kinds of products