Abstract: This report describes a simple technique which can be used with some search engines to bypass censorware bans on searching for forbidden words. Particular emphasis is placed on the situation of the Great Firewall Of China.
The reasons for search engines to be regarded with suspicion should be evident. They may point a way to sites with prohibited content, which have not yet been put on the censorware blacklist. Some, such as Google, have a cache of web pages, which represents a loophole in censorware control.
When total search-engine bans are not feasible (in a political, not technological, sense), the control may be reduced to denying the ability to seek out information based on certain terms. For example, in China, trying to do a search for the word "falun" (from the forbidden "Falun Gong") may be banned. See the discussion in Edelman and Zittrain, "Empirical Analysis of Internet Filtering in China", or Amnesty International, "State Control Of The Internet In China".
This report shows a simple technique to bypass such searching prohibitions, using an undocumented ability of certain search engines (unfortunately, not Google).
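Search forms usually send the search terms to the server with the "GET" method, which places the data in the URL itself, where censorware can inspect it. However, there is another method for sending such data to a server, known as "POST". Data sent by this method is not in the URL, but transmitted separately, in the body of the request (a CGI program on the server reads it from "standard input"). Message bodies are the means by which documents are normally uploaded and downloaded, so they are typically not checked by censorware (though there are exceptions).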
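The difference between the two methods can be seen by building, without sending, the same search request both ways. This is only a minimal sketch: the host search.example and the parameter name q are placeholders, not taken from any real search engine.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical search endpoint and parameter name, for illustration only.
SEARCH_URL = "http://search.example/search"
params = {"q": "falun"}

# GET: the search term is appended to the URL itself.
get_request = Request(SEARCH_URL + "?" + urlencode(params))

# POST: the URL carries no search term; it travels in the request body.
post_request = Request(SEARCH_URL, data=urlencode(params).encode("ascii"))

# Nothing is sent over the network; the requests are only inspected.
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

A filter that only examines URLs would see the forbidden word in the first request but not in the second.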
It turns out that though many search engine entry forms send their search data via the "GET" method, they can easily be converted to send that data via the "POST" method.
The procedure to do this conversion is a simple modification of the HTML used. Although it requires a little familiarity with HTML editing, it is straightforward:
1) Go to the search engine page, say http://www.alltheweb.com/advanced
2) Save the page, as HTML, to a file. Edit this file as follows.
3) Look for an HTML tag "<head>". On the line after this tag, add a new line starting <base href=", then containing the URL of the search engine page, then finish with ">, here: <base href="http://www.alltheweb.com/advanced">
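4) Look for an HTML tag which starts with the characters "<form" (tag and attribute names may appear in upper or lower case).
a) If there is a method="get" string on that line, change it to method="post".
b) If there is no method= string, add a method="post" string right after the "<form" portion. So the result would start out: <form method="post"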
Load this new file. Any searches typed into this form should now be sent as text data, and bypass the prohibitions of many censorware programs.
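For readers who would rather not edit the file by hand, steps 3 and 4 could be automated roughly as follows. This is a sketch under stated assumptions, not a robust tool: the function name convert_form_to_post is made up for illustration, and it relies on regular expressions rather than a full HTML parser, so unusual markup may need manual attention.

```python
import re

# Matches a method attribute with a quoted or unquoted value.
METHOD_ATTR = re.compile(
    r"""(?<![\w-])method\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)

def convert_form_to_post(html, page_url):
    """Apply steps 3 and 4 to the saved HTML of a search page."""
    # Step 3: add a <base> tag on the line after the first <head> tag.
    html = re.sub(
        r"(<head\b[^>]*>)",
        lambda m: m.group(1) + '\n<base href="' + page_url + '">',
        html, count=1, flags=re.IGNORECASE,
    )

    def fix_form(match):
        tag = match.group(0)
        if METHOD_ATTR.search(tag):
            # Step 4a: change the existing method attribute to post.
            return METHOD_ATTR.sub(lambda _m: 'method="post"', tag, count=1)
        # Step 4b: no method attribute, so add one right after "<form".
        return tag[:5] + ' method="post"' + tag[5:]

    # Step 4: rewrite every form tag on the page.
    return re.sub(r"<form\b[^>]*>", fix_form, html, flags=re.IGNORECASE)

sample = ('<html><head>\n<title>Search</title></head>\n'
          '<body><form action="/search" method="get">'
          '<input name="q"></form></body></html>')
print(convert_form_to_post(sample, "http://www.alltheweb.com/advanced"))
```

The output would be written to a new file and loaded in the browser exactly as with the hand-edited version.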
This change in method doesn't carry through to any search results screens. That is, once the results are returned, clicking through from that page to the next screen of results would still use the "GET" method, and so run afoul of the censorware search prohibitions. That second results screen would have to be saved and edited again per the procedure above. But if the number of search results is set high, there should be little need for such repetition.
In addition, Chinese users might find it a good idea to turn off automatic image loading in their browsers. Sometimes "in-line images" for advertisements send back data in their image URLs, which can activate keyword-based prohibitions. Turning off image loading is typically a browser's Preferences menu option. In the Mozilla browser, it's under "Privacy & Security" or "Advanced", then "Images". The "Image Acceptance Policy" setting should be at least "Accept images that come from the originating server only", possibly "Do not load any images".
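To see whether a saved results page would leak the search term through ordinary links or image URLs, the page can be scanned for it. The following is a rough sketch using only the Python standard library; the names KeywordUrlFinder and find_keyword_urls are invented for this example, and it checks only href and src attributes.

```python
from html.parser import HTMLParser
from urllib.parse import unquote_plus

class KeywordUrlFinder(HTMLParser):
    """Collect href/src URLs that contain a given keyword."""

    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword.lower()
        self.hits = []  # (tag, attribute, url) tuples

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Decode %xx and "+" so encoded terms are also noticed.
                if self.keyword in unquote_plus(value).lower():
                    self.hits.append((tag, name, value))

def find_keyword_urls(html, keyword):
    """Return every link or image URL in html that mentions keyword."""
    finder = KeywordUrlFinder(keyword)
    finder.feed(html)
    finder.close()
    return finder.hits

page = ('<a href="/search?q=falun&amp;page=2">Next</a>'
        '<img src="http://ads.example/i.gif?kw=fal%75n">'
        '<img src="/logo.gif">')
for tag, attribute, url in find_keyword_urls(page, "falun"):
    print(tag, attribute, url)
```

Any URL reported this way would be requested with the "GET" method if followed or loaded, and so could trigger a keyword-based block.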
See more of Seth Finkelstein's Censorware Investigations