In approaching this resource-allocation problem, we plan to treat the alternative information services as competing economic activities. Given a measure of priorities over the end-user services provided, the various agents effectively compete to provide the highest level of service using minimal computational resources. We emphasize, however, that the ultimate specification of service priorities is left open to the operators of the digital library. Whether the competition is to be realized via explicit economic transactions or merely an internal "currency" is a policy question on which we take no stance at present. Regardless, if the competition operates smoothly, the result can be an efficient overall allocation of computational resources towards the optimal provision of services to users.
To organize the processing activities within an economic framework, we view the interactions between agents as supplier-producer relationships, where each agent produces value-added information products from the input products provided by others. Agents dynamically connect with each other as opportunities arise for mutually beneficial exchanges. The collections provide the ultimate "raw materials" in this process, whereas the end users are the ultimate consumers of the "finished goods". The intermediate agents ("middlemen") bridge the gap by bringing to bear knowledge, processing, storage, or other computational resources to improve in some way the expected value of the information as it passes along the chain from agent to agent.
Economic mechanisms for allocating computational resources have been studied by
a variety of researchers in recent years (Cheriton & Harty, 1993; Huberman, 1988; Kurose & Simha, 1989; Stonebraker et al.; Waldspurger, Hogg,
Huberman, Kephart, & Stornetta, 1992).
Our implementation of virtual markets in information services will be based on
the idea of "smart auctions" proposed for smooth allocation of bandwidth on the
Internet (MacKie-Mason & Varian, 1993).
The mechanisms for managing multiple, interacting markets will be based on our
previously developed "market-oriented programming" system (Wellman, 1993).
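
As a rough illustration of the auction idea, the following Python sketch clears a market for a single resource of fixed capacity by admitting the highest bids and charging every admitted request the same clearing price. The rule that the clearing price equals the highest rejected bid, and the names Bid and clear_market, are assumptions made for this illustration; they are not taken from the cited work or from our system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Bid:
    agent: str    # hypothetical agent identifier
    price: float  # most the agent is willing to pay for one unit


def clear_market(bids: List[Bid], capacity: int) -> Tuple[List[str], float]:
    """Admit the `capacity` highest bids and return (winners, clearing price).

    Assumption: every winner pays the highest rejected bid, or zero when
    capacity is not exhausted, so no winner pays more than it bid.
    """
    ranked = sorted(bids, key=lambda b: b.price, reverse=True)
    winners = [b.agent for b in ranked[:capacity]]
    rejected = ranked[capacity:]
    price = rejected[0].price if rejected else 0.0
    return winners, price


# Hypothetical agents competing for two units of processing capacity.
bids = [Bid("indexer", 5.0), Bid("summarizer", 3.0), Bid("translator", 1.0)]
winners, price = clear_market(bids, capacity=2)
print(winners, price)  # ['indexer', 'summarizer'] 1.0
```

Under this assumed rule the price an agent pays does not depend on its own bid, only on whether the bid is high enough to be admitted, and the price falls to zero whenever the resource is uncongested.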
Economic Issues
In order for the production and distribution of information goods to be
economically viable, there must be some mechanism for recovering costs. Yet
pricing information goods is notoriously difficult. The first copy of an
information good will often be very costly to produce, while subsequent copies
may cost next to nothing. The combination of high fixed costs and negligible
marginal costs creates difficulties for conventional forms of pricing. For
example, standard economic theory argues that it is desirable to price goods at
marginal cost. But if the cost of (re)production is zero, marginal cost pricing
will not recover costs.
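
The cost-recovery problem can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the cost figures are invented for the example.

```python
def revenue_shortfall(fixed_cost: float, marginal_cost: float, copies: int) -> float:
    """Loss incurred when every copy is sold at marginal cost."""
    price = marginal_cost                       # marginal-cost pricing
    revenue = price * copies
    total_cost = fixed_cost + marginal_cost * copies
    return total_cost - revenue                 # equals fixed_cost, up to rounding


def break_even_price(fixed_cost: float, marginal_cost: float, copies: int) -> float:
    """Uniform per-copy price (average cost) that exactly covers total cost."""
    return marginal_cost + fixed_cost / copies


# Hypothetical figures: 10,000 to produce the first copy, nothing per extra copy.
print(revenue_shortfall(10_000.0, 0.0, 500))   # 10000.0
print(break_even_price(10_000.0, 0.0, 500))    # 20.0
```

However many copies are distributed, pricing at marginal cost leaves the entire fixed cost unrecovered, while the break-even price stays above marginal cost by the fixed cost per copy.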
Conventional markets for information address this problem by bundling the information good with a good that is costly to reproduce: printed books, documentation, user support, a special kind of viewer, etc. We will consider digital-library analogs of this approach, where provision of documents is bundled with special services such as delivery, search, customization, and so on, which add "user-specific value" to the information. For example, a user might want to retrieve cross-tabulated data on the incidence of hurricanes and insurance premiums for different locations, along with newspaper anecdotes and photographs. Such a search would require querying several disparate databases and merging the results.
As this example illustrates, the value added to the user depends on the organization of the information, not simply the raw data. Similarly, the resource cost to the provider depends on the expense of organizing and customizing the information. Since this expense is a non-negligible marginal cost, charging for it can approach the desired economic result of marginal-cost pricing. As a side benefit, the fact that information may be organized differently for different users reduces the incentive for unauthorized copying and redistribution.
At an abstract level we can pose the pricing problem as follows. Our objective is to construct a payment scheme that depends only on observable characteristics of users and that maximizes overall benefits subject to the constraint of covering costs. Of course, we have to build into this optimization problem the fact that the users' choices of information services will depend on the nature of the pricing scheme that they face; economists refer to this constraint as the incentive compatibility constraint.
This abstract formulation of the problem can be examined using the methods of mechanism design (Wilson, 1993). In general, we will want prices to differ across users based on both their observed characteristics and the form and amount of the information delivered. For example, the charge for information could be based on 1) frequency of use, 2) immediacy of delivery, 3) structure or formatting, 4) amount of information retrieved, 5) membership in a group, etc. This means, of course, that the digital library must maintain sufficient records to be able to base charges on these characteristics.
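
The optimization problem posed above can be illustrated with a toy example. The Python sketch below searches a grid of prices for two service tiers, lets each user pick whichever tier it prefers (the incentive compatibility constraint), and keeps the menu with the highest total benefit among those that cover costs. All valuations, costs, tier names, and function names are hypothetical, and exhaustive search is used only for clarity; it is not a proposed method.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Hypothetical valuations: how much each user type values each service tier.
VALUES: Dict[str, Dict[str, float]] = {
    "casual":     {"basic": 4.0, "custom": 5.0},
    "researcher": {"basic": 6.0, "custom": 12.0},
}
# Hypothetical per-request cost of each tier, plus a fixed cost to recover.
UNIT_COST: Dict[str, float] = {"basic": 1.0, "custom": 4.0}
FIXED_COST = 6.0


def choice(user: str, prices: Dict[str, float]) -> Optional[str]:
    """Tier the user selects for itself: highest surplus, or None to opt out."""
    best = max(prices, key=lambda tier: VALUES[user][tier] - prices[tier])
    return best if VALUES[user][best] - prices[best] >= 0 else None


def evaluate(prices: Dict[str, float]) -> Tuple[float, float]:
    """Return (total benefit, provider profit) under users' own choices."""
    benefit = revenue = cost = 0.0
    for user in VALUES:
        tier = choice(user, prices)
        if tier is not None:
            benefit += VALUES[user][tier] - UNIT_COST[tier]
            revenue += prices[tier]
            cost += UNIT_COST[tier]
    return benefit - FIXED_COST, revenue - cost - FIXED_COST


def best_menu(grid: List[float]) -> Optional[Tuple[float, Dict[str, float]]]:
    """Highest-benefit price menu on the grid that at least covers costs."""
    best: Optional[Tuple[float, Dict[str, float]]] = None
    for p_basic, p_custom in product(grid, repeat=2):
        prices = {"basic": p_basic, "custom": p_custom}
        benefit, profit = evaluate(prices)
        if profit >= 0 and (best is None or benefit > best[0]):
            best = (benefit, prices)
    return best


print(best_menu([float(p) for p in range(13)]))
```

Note that the prices depend only on the tier requested, which the library can observe, and never on the user's private valuation; the users sort themselves into tiers through their own choices.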
We propose to investigate both the theoretical formulation of the pricing
problem for information goods and the concrete implementation of various
pricing strategies in the context of digital library materials.
Intellectual Property
Information held in the collections may be owned by various entities, some of
which may demand some control over the dissemination of contents or
compensation for access to their copyrighted material. Enforcing such control
can be particularly difficult with digitized information, since users can copy
information just as efficiently as publishers. Although we cannot in this
project resolve all the thorny issues underlying the notion of intellectual
property in a digital library, we must design our system to accommodate
mechanisms to protect information access and support royalty payments and other
remuneration operations. This includes provisions for executing the
transactions themselves, as well as designing the methods for selection of
information services so that they are sensitive to relative costs and other
economic factors. We will also explore particular schemes for compensating
publishers in a distributed environment. One example is "superdistribution"
(Cox, 1993; Mori & Kawahara, 1990),
where access to information is free but its use incurs a charge.
Example: An Information Entrepreneur
To understand the virtual information-services marketplace, it may help to
consider the perspective of a specific hypothetical agent that performs one
small function in the process of an information retrieval task. This agent
watches the network for keyword requests (or whatever a meaningful piece of a
query might be--this example is meant to illustrate resource-allocation issues
rather than our information-retrieval model), and produces abstracts of
documents that it believes are salient to those keywords. This agent's
activities demand resources--processing, storage, and access to documents--and
its value to the system as a whole must justify the allocation of these
resources.
To ensure that resources are allocated to the most valuable enterprises, we measure the value of this agent's product by direct compensation. For example, we could pay the agent a fee for each abstract produced. However, we must also provide an incentive for the agent to produce relevant abstracts. We could have the consumer of the abstract (another mediator agent, or ultimately the end user) pass judgment about relevance, but we also want to ensure that the value of a document is not misrepresented in order to evade fees. One way to do this is to provide the abstracts for free, but then charge a finder's commission whenever the full document is retrieved. Presumably, a retrieval is an accurate signal of the perceived value of the document.
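
The commission arrangement just described can be sketched as a simple ledger. The class name, the flat commission, and the agent and document identifiers below are all hypothetical; this is a minimal illustration of the bookkeeping, not a design for the transaction mechanism.

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple


class CommissionLedger:
    """Abstracts are free; a commission accrues only on full-document retrieval."""

    def __init__(self, commission: float) -> None:
        self.commission = commission               # hypothetical flat fee
        self.offers: Set[Tuple[str, str]] = set()  # (agent, document) pairs
        self.earned: DefaultDict[str, float] = defaultdict(float)

    def offer_abstract(self, agent: str, document: str) -> None:
        """Record that `agent` supplied an abstract of `document` at no charge."""
        self.offers.add((agent, document))

    def retrieve(self, agent: str, document: str) -> float:
        """Credit `agent` if the retrieved document is one it abstracted."""
        if (agent, document) not in self.offers:
            return 0.0
        self.earned[agent] += self.commission
        return self.commission


ledger = CommissionLedger(commission=0.25)
ledger.offer_abstract("abstracter", "doc-17")
ledger.offer_abstract("abstracter", "doc-42")
ledger.retrieve("abstracter", "doc-17")    # only this abstract led to a retrieval
print(ledger.earned["abstracter"])         # 0.25
```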
It may seem that the agent still has an incentive to over-produce abstracts, since increasing the number of abstracts offered might increase the number of documents retrieved. However, an astute information entrepreneur will recognize that offering irrelevant documents will damage its reputation as an information agent, and in the long run decrease its commissions. By keeping track of the identities of other agents they interact with, agents can adapt to patronize those performing most effectively.
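
One way a consuming agent might keep track of its suppliers is sketched below. The scoring rule (a smoothed fraction of offered abstracts that led to a retrieval) and all names are assumptions chosen for illustration; many other reputation measures are possible.

```python
from collections import defaultdict
from typing import DefaultDict, List


class ReputationTracker:
    """Per-supplier record of abstracts offered and documents retrieved."""

    def __init__(self) -> None:
        self.offered: DefaultDict[str, int] = defaultdict(int)
        self.retrieved: DefaultDict[str, int] = defaultdict(int)

    def record(self, supplier: str, was_retrieved: bool) -> None:
        """Log one offered abstract and whether its document was retrieved."""
        self.offered[supplier] += 1
        if was_retrieved:
            self.retrieved[supplier] += 1

    def score(self, supplier: str) -> float:
        """Smoothed fraction of the supplier's abstracts that led to a retrieval.

        Add-one smoothing (an assumption) gives an unseen supplier a score of 0.5.
        """
        return (self.retrieved[supplier] + 1) / (self.offered[supplier] + 2)

    def preferred(self, suppliers: List[str]) -> str:
        """Supplier to patronize next: the one with the highest score."""
        return max(suppliers, key=self.score)


tracker = ReputationTracker()
for hit in [True, True, True]:                 # selective agent: 3 of 3 retrieved
    tracker.record("selective", hit)
for hit in [True] * 3 + [False] * 7:           # over-producer: 3 of 10 retrieved
    tracker.record("prolific", hit)
print(tracker.preferred(["selective", "prolific"]))  # selective
```

Both hypothetical suppliers earn the same number of commissions here, but the over-producer's lower score means it is less likely to be consulted in the future.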
A clear example of the importance of reputation is the prevalence of subscription services. Readers may subscribe to The New York Times, for example, because this newspaper has a track record of providing news relevant to their interests. Similarly, we expect subscription services to proliferate in the digital library, though with far greater customization and flexibility than traditional subscription services. The concept of standing requests and proactive agents can be viewed as a form of subscription service.
This example illustrates some of the economic issues that come up in designing the configuration of agents comprising the digital library. The key point is that designing a distributed system of this sort is largely a matter of getting the incentives right, and economic analysis can play a central role.