UNITED STATES DISTRICT COURT
SOUTHERN DISTRICT OF NEW YORK

BARBARA NITKE, THE NATIONAL COALITION FOR SEXUAL FREEDOM, and THE NATIONAL COALITION FOR SEXUAL FREEDOM FOUNDATION,
Plaintiffs,

-against-

JOHN ASHCROFT, ATTORNEY GENERAL OF THE UNITED STATES OF AMERICA, and THE UNITED STATES OF AMERICA,
Defendants.

01 Civ. 11476 (RMB)

EXPERT REPORT OF SETH FINKELSTEIN
November 10, 2003

I. Opinion of Witness with Basis and Reasons Therefor

A provider of content via the Internet cannot reasonably be expected to know the location of readers, if the context is one in which location would lead to a denial of the ability to read the content.

This is because material can be read on the Internet through many alternate geographic routes, where the content can intentionally be relayed through third-party intermediaries which act to mask and obscure location. Further, intrinsic inaccuracies such as changes in address assignment and proxying by such large providers as America Online (AOL) mean many users cannot be reliably located.

II. Data or Other Information Supporting the Opinion

Introduction - Internet Architecture - The Fallacy Of "Cyberspace"

Using the Internet is often referred to as being in "cyberspace". But as Lawrence Lessig stated in the conclusion to his book "Code and Other Laws of Cyberspace":

.. code is not elsewhere, and we are not elsewhere when we feel its effects. As Andrew Shapiro puts it: "Seeing cyberspace as elsewhere misconstrue[s] its legal significance. It keep[s] us from seeing the way that regulatory forces like code, which some say are 'there,' are actually affecting us here."

The subject matter of this case is in fact exactly where we are, and the realities of that location.

But thinking of use of the Internet as being in "a place" imports some very misleading aspects of that metaphor, such as an idea that restrictions based on ideas from the framework of physical geography can be simply mapped onto the network geography involved in electronic communication.

An old saying instructs us that "There is nothing new under the sun". And indeed, Internet communication does not possess any mystical aspects. However, a huge scale shift, and quantitative growth in communication ability, creates a qualitative change. The vastly increased ability of people to communicate from all over the world, using multiple paths of connecting to one another, makes certain previous communication constraints impractical to impose.

"Co-operative" vs. "Oppositional" location

There certainly exist many services which attempt to locate users or sites based on their IP addresses. For example, the search engine Google has an experimental web search incorporating location, at
http://labs.google.com/location

Vendors sell location services, and this will be discussed further below. But before launching into details of products, an overarching principle must be understood.

There are situations of what might be termed co-operative geo-location. That is, it is in the interests of the party being located to co-operate with supplying geographic information, in order to gain some benefit. In sum, they want to be found, and to be found with precision.

The issues of this case involve what might be termed oppositional geo-location. That is, it is AGAINST the interests of the party being located to co-operate with supplying geographic information. Accurate information may serve to deny them what they seek. Thus they have an incentive to make sure they display an answer which benefits them.

This difference, between the "co-operative" vs. "oppositional" kinds of applications, cannot be overstressed. In both cases, extensive products exist to serve the market for the corresponding type. But crucially, these are very different markets. Indeed, to some extent (as perhaps shown by this case), the market for oppositional location is generated as a privacy-based reaction where, per above, the subject does not want to engage in co-operative location.

Privacy Protection Services

The primary personal information intermediary is the privacy/anonymity service. These are companies which sell a service which protects the personal details of the user. For example:

http://www.anonymize.net/

By sending a simple inquiry to Internet registries every one curious can find out:
geographic location of your or your company Internet Services Provider (ISP); ...

How we can help. ...

No one will be able to gather any information about you from your IP address. The only data available will be our IP address located in the Bahamas and our domain (contact details only if an inquiry is sent to Internet Registries).

[See exhibit file anonymizenet1.gif, URL http://anonymize.net/]

Or "Anonymizer" (http://www.anonymizer.com/)

Problem ... Companies or people can steal your IP Address to figure out:

Where you live
What your e-mail address is
Get your credit card and other personal information

Solution ... Get an Anti-Tracking privacy solution.

[See exhibit file anonymizercom1.gif, URL http://www.anonymizer.com/snoop/test_ip.shtml]

Or http://www.proxyone.com/

The internet is not always a safe secure place to be. Many people don't realize the security risks involved or the information they may be unknowingly providing to others as they surf the internet. Now you can anonymously surf the internet and protect your privacy.

There are a large number of such services. See exhibit file proxydir1.gif, URL http://directory.google.com/Top/Computers/Internet/Proxies/

And many are even available without cost. See exhibit file proxydir2.gif, URL http://directory.google.com/Top/Computers/Internet/Proxies/Free/

I conducted a series of experiments viewing an IP location service, ip-to-location.com (http://www.ip-to-location.com) through various privacy proxies.

The exhibits below show the results of the various geographic locations which were returned:

iplocation1.gif - The ip-to-location.com service, default behavior, shows my location (correctly) as Boston, Massachusetts.

iplocation2.gif - viewed through babelfish translator (acting as a proxy in effect) shows my location as Saugerties, New York

iplocation3.gif - viewed through proxyone.com shows my location as Saint Petersburg, Florida

iplocation4.gif - viewed through the-cloak.com shows my location as San Francisco, California

iplocation5.gif - viewed through pureprivacy.com shows my location as San Antonio, Texas

iplocation6.gif - viewed through guardster.com shows my location as San Francisco, California

iplocation7.gif - viewed through proxyweb.net shows my location as British Columbia, Vancouver, CANADA

iplocation8.gif - viewed through mdsme.de shows my location as GERMANY

iplocation9.gif - viewed through anonymouse.ws shows my location as Washington, DC

iplocation10.gif - viewed through proxyify.com shows my location as Valley Stream, New York

metaspinner1.gif - example of using metaspinner proxy, located in Germany

These services can be quite easy to use. For example, several are conveniently collected in an "Anonymous Browsing Quick-Start Page" (http://www.space.net.au/~thomas/quickbrowse.html), with links to other resources. See exhibit files quickbrowse1.gif and quickbrowse2.gif.

To drive home the point, note exhibit file pw-nitke1.gif - Example of viewing barbaranitke.com site through proxyweb.net proxy
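The mechanism behind these results can be sketched in a few lines of Python. This is an illustration only, not the method used by ip-to-location.com or by any of the proxies named above. The lookup table, the function names, and the addresses are all hypothetical: the address blocks are the reserved documentation ranges, and their pairing with place names from the exhibits is an assumption made purely for the example.

```python
import ipaddress

# Hypothetical lookup table. The networks are reserved documentation ranges,
# NOT the real addresses of any reader or proxy mentioned in this report.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "Boston, Massachusetts",
    ipaddress.ip_network("198.51.100.0/24"): "Saint Petersburg, Florida",
    ipaddress.ip_network("203.0.113.0/24"): "GERMANY",
}


def locate(ip):
    """Return the place associated with an address, or 'unknown'."""
    addr = ipaddress.ip_address(ip)
    for network, place in GEO_TABLE.items():
        if addr in network:
            return place
    return "unknown"


def address_seen_by_site(reader_ip, proxy_ip=None):
    """A site sees the address of whichever machine connects to it.

    When a proxy relays the request, that machine is the proxy,
    not the reader.
    """
    return proxy_ip if proxy_ip is not None else reader_ip


reader = "192.0.2.10"  # hypothetical reader address
direct = locate(address_seen_by_site(reader))
via_proxy = locate(address_seen_by_site(reader, proxy_ip="198.51.100.7"))

print("direct:   ", direct)
print("via proxy:", via_proxy)
```

The same reader yields two different answers, because the location lookup is applied to the connecting address rather than to the person reading.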

The Internet Archive

A critical aspect of Internet communication is the enormous amount of copying and archiving which is possible. Once information is made public, it can be replicated and made available from many other sites ("mirroring"). The person who first made the information available has absolutely no (technical) control over what other site operators do with it, and certainly cannot (technically) control how many people might read it from those other sites.

One of the best examples of this phenomenon is "The Internet Archive" (http://www.archive.org/), an Internet site which collects copies of other Internet sites. As it describes itself (see exhibit file archive1.gif):

The Internet Archive is building a digital library of Internet sites and other cultural artifacts in digital form. Like a paper library, we provide free access to researchers, historians, scholars, and the general public.

So this is somewhat analogous to the way a physical library in one location might acquire material and lend it to someone in another location, without the writer being aware of the lending. However, here anyone in the world can visit the digital library, from any place in the world.

As shown in exhibit file archive2.gif, parts of Ms. Nitke's website have been archived. While not everything from the http://barbaranitke.com/ website is archived, significant material from the site has been stored. This includes some images, as shown by the display in the exhibit file archive3.gif and the image result shown in exhibit file archive4.gif. Many more images are available in this website archive, without any way for Ms. Nitke to know even if they are being viewed at all.
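Why the original publisher cannot see archive readers can be shown with a toy model. The Python sketch below does not describe how the Internet Archive is actually implemented. The class, the function, the page path, and the reader names are all hypothetical, and the only assumption is that each site records just the requests made to it.

```python
class Site:
    """A toy web site that stores pages and logs who requested them."""

    def __init__(self, name):
        self.name = name
        self.pages = {}
        self.access_log = []

    def publish(self, path, content):
        self.pages[path] = content

    def serve(self, path, reader):
        # A site can only log requests that are made to it directly.
        self.access_log.append((reader, path))
        return self.pages[path]


def archive(origin, mirror, path, crawler="archive-crawler"):
    """Copy one page from the origin site to the mirror site."""
    mirror.publish(path, origin.serve(path, reader=crawler))


origin = Site("origin")   # stands in for the original publisher
mirror = Site("archive")  # stands in for a third-party archive
origin.publish("/example-page", "example content")

archive(origin, mirror, "/example-page")

# Later readers fetch the copy; the origin is never contacted.
mirror.serve("/example-page", reader="reader-1")
mirror.serve("/example-page", reader="reader-2")

print("origin log:", origin.access_log)
print("mirror log:", mirror.access_log)
```

In this model the origin's log contains the single crawler visit and nothing else, however many people read the archived copy.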

Intrinsic Inaccuracies Of Geolocation Databases

Perhaps the most candid assessment of the weaknesses of geolocation accuracy comes directly from a vendor's discussion of its own database limitations. In its description of "IP2Location(TM) products", at http://www.ip-to-location.com/README-IP-COUNTRY-REGION-CITY.htm#17 (see exhibit inaccurate1.gif), the vendor states
(emphasis added in the quote below)

17. How many countries are included in the database? What is the accuracy?

The database has over 95% of accuracy in country and ISP level, 70% in region level and 65% in city level, which is higher than any of our competitors. The country-level inaccuracy is due to dynamic IP address allocation by large ISPs such as AOL and MSN TV. Because AOL uses a network that routes all of the company's Internet traffic through Reston, Virginia. All IP-based geo-location services, including IP2Location, are unable to determine the state and city for people who dial into the AOL network. The region-level and country-level inaccuracy is due to the flexibility given to each ISP to re-assign dynamic IP address within their service area.

An accuracy of "70% in region level and 65% in city level" might be fine for improving the targeting of advertisements or generating a report on customer demographics. These services can thus be useful for businesses, where the worst case is that marketing money is misspent. However, for the extraordinarily demanding context of criminal liability, such an accuracy rate cannot be acceptable.
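To put the vendor's percentages in concrete terms, the short Python calculation below converts them into expected error counts. The accuracy figures are the vendor's own, as quoted above. The audience of 1,000 visitors, the batch of 10 lookups, and the assumption that lookup errors are independent of one another are hypothetical choices made only for illustration.

```python
# Vendor-stated accuracy figures, from the quotation above.
REGION_ACCURACY = 0.70
CITY_ACCURACY = 0.65


def expected_errors(visitors, accuracy):
    """Expected number of visitors placed in the wrong location."""
    return visitors * (1 - accuracy)


def prob_all_correct(lookups, accuracy):
    """Chance that every one of several lookups is right.

    Assumes errors are independent, which is a simplification.
    """
    return accuracy ** lookups


VISITORS = 1000  # hypothetical audience size
print("wrong region:", round(expected_errors(VISITORS, REGION_ACCURACY)))
print("wrong city:  ", round(expected_errors(VISITORS, CITY_ACCURACY)))
print("10 city lookups all correct:",
      round(prob_all_correct(10, CITY_ACCURACY), 4))
```

Under these assumptions, roughly 300 of every 1,000 visitors would be placed in the wrong region and roughly 350 in the wrong city, and the chance of ten consecutive city-level lookups all being correct is under 2%.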

A similar point was made at length for taxation, in a report issued by the Information Technology Association Of America (http://www.itaa.org/) regarding "Ecommerce Taxation And The Limitations of Geolocation Tools" (available at http://www.itaa.org/taxfinance/docs/geolocationpaper.pdf, included as an exhibit). It concludes:

Geolocation technologies do provide valuable non-tax commercial functionalities (i.e., marketing data, etc.) where a high degree of accuracy regarding a user's jurisdiction is not required at a transaction level. However, given the current inability of such technologies to overcome obstacles presented by corporate networks, anonymizers, AOL users, IPv6, and the other issues discussed above, coupled with their lack of complete certainty as to customer location, they cannot be relied upon for consumption tax purposes.

As the ITAA Geolocation paper press release stated (available at http://www.itaa.org/news/pr/PressRelease.cfm?ReleaseID=-1480108901, included as exhibit geolocationpaperpr.txt, emphasis added below):

"The EU VAT rules that will go into effect next year place an unmanageable burden on geolocation software products. Our report finds that geolocation software does not resolve any of the concerns about being able to independently identify the correct taxing jurisdiction," said ITAA President Harris N. Miller. "It is inappropriate for governments to mandate the use of these tools for taxing or for any other purposes."

Or, as an article from "Interactive Week" put it (available on the web at http://groups.yahoo.com/group/TYR/message/2265?source=1, included as exhibit spangler.txt):

"It's impossible for these guys to be 100 percent accurate," says Peter Christy, a Jupiter Media Metrix analyst. "You can't use this for life-and-death situations."

Costs And Burdens

Perhaps the best way to conclusively demonstrate a significant burden is to compare the complex problem of determining the obscenity laws of various jurisdictions with a much simpler problem: determining the sales tax laws of various jurisdictions. What is at issue here is more than mere location. It's location plus laws.

Sales tax is a fairly well-defined item. It's one number, regardless of any type of subject determination or merit. So we have one geographically-determined number, for one page (a sales-transaction page). Taxware (http://www.taxware.com/) is a well-known company which sells products to handle this problem. Their lowest price general solution, "Tax Manager", (http://www.taxware.com/solutions/taxmanager.html), is $695 for a single user. See exhibit file taxmanager1.gif.

That's just for the software itself. Then it must be integrated with the web server. I happen to have personal experience implementing Taxware products in an e-commerce site. It took me about a day of professional programming time, to read their material, write the code, and give it some tests. That resulted in a consulting-agency charge of around $880 for the day (8 hours x $110/hour = $880). Note this figure is consistent with the amounts Taxware charges for training in the use of its product, e.g. $1100 for two days. See exhibit file taxtrain1.gif.

Of course this added to the overhead of the site itself, as it was an additional database which had to have storage allocated to it and be backed up, and which posed a risk of failure during the transaction if corrupted or unanticipated data caused a bug.

So even to start, with the most minimal factors, there is a solid estimate of $695 + $880 = $1575 from a much simpler problem. That is again, one page, one number, a lower-bound market-value cost for a single-page, single-value, determination. The issues here involve scaling this up by an order of magnitude in two different directions simultaneously, both in pages and values, with all covered pages having to take into account all values.

We can derive further well-grounded lower bounds by comparing our problem to the costs involved in moving from a single user software license to a multiple user software license. That is, we might consider the known costs of multiple users making the same determination for a single page to be reasonably informative as to the possible costs of having a single user making the same determination for multiple pages. So concretely, instead of the single user version of the Taxware product at $695, we'd need something comparable to the unlimited license version at $1,995 (again exhibit file taxmanager1.gif). And at least another programmer half day of more complex testing on the site, since it now has to cover many pages. So we increase the absolute minimum to $1,995 + $1320 (1.5 programmer-days at $880/day) = $3315.

This has yet to take into account the increase in complexity from obscenity law versus sales tax law. Since moving from a single to multiple software license at least doubled the cost, it seems reasonable to argue going from a simple to complex determination is at least another doubling: $3315 * 2 = $6630. Again, this is not meant as a funding estimate for a business case. Rather, it's intended as a concrete value to indicate where such requirements are priced out of the range of personal websites. When factors such as liability insurance are considered, the true costs of such a system could easily be very much higher.
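For convenience, the chain of lower-bound figures above can be reproduced step by step. The Python below only restates the arithmetic already given in this section. The variable names are illustrative, and the doubling factor is the report's own argued assumption, not a measured value.

```python
# Figures taken from this section of the report.
HOURLY_RATE = 110             # consulting-agency rate, dollars per hour
HOURS_PER_DAY = 8
SINGLE_USER_LICENSE = 695     # Taxware "Tax Manager", single user
UNLIMITED_LICENSE = 1995      # Taxware "Tax Manager", unlimited license
COMPLEXITY_FACTOR = 2         # argued doubling: obscenity law vs. sales tax

day_rate = HOURLY_RATE * HOURS_PER_DAY

# One page, one value: single-user license plus one programmer-day.
single_page = SINGLE_USER_LICENSE + day_rate

# Many pages: unlimited license plus 1.5 programmer-days.
programming = int(1.5 * day_rate)
multi_page = UNLIMITED_LICENSE + programming

# Simple to complex determination: at least another doubling.
with_complexity = multi_page * COMPLEXITY_FACTOR

print("programmer-day:      $", day_rate)
print("single page:         $", single_page)
print("multiple pages:      $", multi_page)
print("complex determination: $", with_complexity)
```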

In fact, abstractly, the problem of non-overbroad automated compliance with obscenity law itself seems extremely involved. How would one even specify a programmatic interface between what is shown in an image, and what is prohibited in a community? I'm advised there are between 1,500 and 2,000 communities in the United States. There's a vast difference between the tasks of having a human check an item being mailed, in which case only specific communities and specific items have to be checked, and somehow having a system which will instantly and automatically test all possible items versus all possible communities. You have to be able to see it to know it when you see it. There are probably several papers which could be written just speculating on a program-based solution.
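The scale of the "all items versus all communities" problem can be made concrete with a simple count. In the Python sketch below, the community range is the one stated above. The figure of 100 images is hypothetical, chosen only to show how the number of separate determinations grows, and the function name is illustrative.

```python
# Range of communities stated in the report.
COMMUNITIES_LOW = 1500
COMMUNITIES_HIGH = 2000


def determinations(items, communities):
    """Each item must be judged against each community's standard."""
    return items * communities


# Mailing one item to one known community requires one determination.
mailed = determinations(items=1, communities=1)

# A web site must cover every item for every possible community.
IMAGES_ON_SITE = 100  # hypothetical number of images
web_low = determinations(IMAGES_ON_SITE, COMMUNITIES_LOW)
web_high = determinations(IMAGES_ON_SITE, COMMUNITIES_HIGH)

print("mailed item:", mailed)
print("web site:   ", web_low, "to", web_high)
```

Even for a modest hypothetical site, the count runs from 150,000 to 200,000 item-and-community pairs, each of which would need its own judgment.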

Given the possibility of jail time if the system is wrong, there should be no doubt that even attempting such a requirement would be extremely costly and burdensome.

III. Any exhibits or articles in support

To recap the exhibit files:

IP Location failures and privacy browsing:

iplocation1.gif iplocation2.gif iplocation3.gif iplocation4.gif iplocation5.gif iplocation6.gif iplocation7.gif iplocation8.gif iplocation9.gif iplocation10.gif metaspinner1.gif pw-nitke1.gif

Privacy/Anonymity information:

anonymizenet1.gif anonymizercom1.gif proxydir1.gif proxydir2.gif quickbrowse1.gif quickbrowse2.gif

Internet Archive:

archive1.gif (definition) archive2.gif (site partially archived) archive3.gif (images archived) archive4.gif (images archived)

Geolocation accuracy:

inaccurate1.gif (vendor's accuracy) geolocationpaper.pdf (ITAA paper) geolocationpaperpr.txt (ITAA press release) spangler.txt ("Interactive Week" article)

Tax product cost:

taxmanager1.gif (Taxware Tax Manager) taxtrain1.gif (Taxware Training)

IV. Qualifications and Resume

(a) Any publications w/in last ten years
Relevant curriculum vitae:

Electronic Frontier Foundation (EFF) Pioneer Award Winner, 2001
http://www.eff.org/awards/20010305_pioneer_pr.html

Blacklisting Bytes - NRC EFF/Finkelstein Censorware White Paper #1
for National Research Council Project on Tools and Strategies for
Protecting Kids from Pornography and Their Applicability to Other
Inappropriate Internet Content
Co-authors: Seth Finkelstein, Consulting Programmer; Lee Tien, Senior
Staff Attorney, EFF
http://www.eff.org/Censorship/Censorware/20010306_eff_nrc_paper1.html
http://www7.nationalacademies.org/itas/whitepaper_1.html

Cited in National Research Council report:
http://www.nap.edu/html/youth_internet/ch2_b5.html

Also a chapter in EPIC's book "Filters & Freedom 2.0"
http://www.epic.org/bookstore/filters2.0/

Computer Security Handbook (4th edition) 
Chapter 51. Censorship and Content Filtering - Lee Tien and Seth Finkelstein
http://www.drj.com/bookstore/drj642.htm

"the programmer principally responsible for the investigation of 
X-Stop filtering software and its flaws, vital to the landmark
Mainstream Loudoun victory"
See http://www.eff.org/Censorship/Internet_censorship_bills/2000/20001222_eff_hr4577_statement.html

Prominent work in 
Amicus Brief for CIPA: Online Policy Group / Seth Finkelstein 
http://www.eff.org/Censorship/Censorware/Multnomah_Library_v_US/cipasupremebrief030210.pdf

Testified to Library of Congress on April 11, 2003 in DMCA rulemaking hearing
for anticircumvention exemptions
http://sethf.com/anticensorware/hearing_dc.php

Result was one of only four DMCA exemptions granted
"The Register's recommendation in favor of this exemption is based
primarily on the evidence introduced in the comments and testimony by one
person, Seth Finkelstein, a non-lawyer participating on his own behalf."
See http://www.copyright.gov/1201/docs/registers-recommendation.pdf

Programming consultation for
Programmers' & Academics' Amici Brief in NY MPAA DeCSS Case
http://www.eff.org/IP/Video/MPAA_DVD_cases/20010126_ny_progacad_amicus.html
See http://legalminds.lp.findlaw.com/list/cyberia-l/msg30238.html

Opinion: The TotalNews Case - Confusion in Comprehension, Not Display
The Internet Legal Practice Newsletter   May 1997
[ archived at http://sethf.com/essays/major/totalnews.php ]

(b) [Redacted]

(c) Have you ever testified as an expert before?

No.

Dated: Cambridge, Massachusetts November 10, 2003