Bibliography of the Book
Modeling the Internet and the Web: Probabilistic Methods and Algorithms
Pierre Baldi, Paolo Frasconi, Padhraic Smyth
Several of the cited papers are online and can be retrieved by following the provided links.
Legend:
[.html], [.pdf], [.ps], [.ps.Z], [.ps.gz]: author's version of the paper, in the specified format
[CS]: document page at CiteSeer
[Pub]: document page at publisher's site
Authors' links may become broken. In such cases the papers might still be retrievable from the cache at CiteSeer.
A superset of the bibliography of "Modeling the Internet" is also available in BibTeX format: ModelingTheInternet.bib.gz
Abello, J., Buchsbaum, A. and Westbrook, J. 1998 A functional approach to external graph algorithms. Proc. 6th European Symp. on Algorithms, pp. 332–343. [.ps.Z] [CS]
Achacoso, T. B. and Yamamoto, W. S. 1992 Ay’s Neuroanatomy of C. elegans for Computation. Boca Raton, FL: CRC Press.
Aczel, J. and Daroczy, Z. 1975 On measures of information and their characterizations. New York: Academic Press.
Adamic, L., Lukose, R. M., Puniyani, A. R. and Huberman, B. A. 2001 Search in power-law networks. Phys. Rev. E 64, 046135. [Pub]
Aggarwal, C. C., Al-Garawi, F. and Yu, P. S. 2001 Intelligent crawling on the World Wide Web with arbitrary predicates. In Proc. 10th Int. World Wide Web Conf., pp. 96–105. [Pub] [CS]
Aiello, W., Chung, F. and Lu, L. 2001 A Random Graph Model for Power Law Graphs. Experimental Math. 10, 53–66. [.pdf] [CS]
Aji, S. M. and McEliece, R. J. 2000 The generalized distributive law. IEEE Trans. Inform. Theory 46, 325–343. [.pdf]
Albert, R. and Barabási, A.-L. 2000 Topology of evolving networks: local events and universality. Phys. Rev. Lett. 85, 5234–5237. [.pdf]
Albert, R., Jeong, H. and Barabási, A.-L. 1999 Diameter of the World-Wide Web. Nature 401, 130. [.pdf]
Albert, R., Jeong, H. and Barabási, A.-L. 2000 Error and attack tolerance of complex networks. Nature 406, 378–382. [.pdf]
Allwein, E. L., Schapire, R. E. and Singer, Y. 2000 Reducing multiclass to binary: a unifying approach for margin classifiers. Proc. 17th Int. Conf. on Machine Learning, pp. 9–16. San Francisco, CA: Morgan Kaufmann. [Pub] [CS]
Amaral, L. A. N., Scala, A., Barthélémy, M. and Stanley, H. E. 2000 Classes of small-world networks. Proc. Natl Acad. Sci. 97, 11149–11152. [.pdf]
Amento, B., Terveen, L. and Hill, W. 2000 Does authority mean quality? Predicting expert quality ratings of Web documents. Proc. 23rd Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 296–303. New York: ACM Press. [.pdf] [CS]
Anderson, C. R., Domingos, P. and Weld, D.
2001 Adaptive Web navigation for wireless devices. Proc. 17th Int. Joint Conf. on Artificial Intelligence, pp. 879–884. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS]
Anderson, C. R., Domingos, P. and Weld, D. 2002 Relational Markov models and their application to adaptive Web navigation. Proc. 8th Int. Conf. on Knowledge Discovery and Data Mining, pp. 143–152. New York: ACM Press. [.pdf]
Androutsopoulos, I., Koutsias, J., Chandrinos, K. and Spyropoulos, D. 2000 An experimental comparison of naive Bayesian and keyword-based anti-spam filtering with personal e-mail messages. Proc. 23rd ACM SIGIR Ann. Conf., pp. 160–167. [.pdf]
Ansari, A. and Mela, C. 2003 E-customization. J. Market. Res. (In the press.) [.pdf]
Apostol, T. M. 1969 Calculus, vols I and II. John Wiley & Sons, Ltd/Inc.
Appelt, D., Hobbs, J., Bear, J., Israel, D., Kameyama, M., Kehler, A., Martin, D., Meyers, K. and Tyson, M. 1995 SRI International FASTUS system: MUC-6 test results and analysis. Proc. 6th Message Understanding Conf. (MUC-6), pp. 237–248. San Francisco, CA: Morgan Kaufmann. [.ps.gz] [CS]
Apté, C., Damerau, F. and Weiss, S. M. 1994 Automated learning of decision rules for text categorization. (Special Issue on Text Categorization.) ACM Trans. Informat. Syst. 12, 233–251. [.pdf] [CS]
Araújo, M. D., Navarro, G. and Ziviani, N. 1997 Large text searching allowing errors. In Proc. 4th South American Workshop on String Processing (ed. R. Baeza-Yates), International Informatics Series, pp. 2–20. Ottawa: Carleton University Press. [Pub] [CS]
Armstrong, R., Freitag, D., Joachims, T. and Mitchell, T. 1995 WebWatcher: a learning apprentice for the World Wide Web. Proc. 1995 AAAI Spring Symp. on Information Gathering from Heterogeneous, Distributed Environments, pp. 6–12. [.pdf] [CS]
Aslam, J. A. and Montague, M. 2001 Models for metasearch. Proc. 24th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 276–284. New York: ACM Press. [.ps]
Baldi, P.
2002 A computational theory of surprise. In Information, Coding, and Mathematics (ed. M. Blaum, P. G. Farrell and H. C. A. van Tilborg), pp. 1–25. Boston, MA: Kluwer Academic.
Baldi, P. and Brunak, S. 2001 Bioinformatics: The Machine Learning Approach, 2nd edn. MIT Press, Cambridge, MA. [Pub]
Baluja, S., Mittal, V. and Sukthankar, R. 2000 Applying machine learning for high performance named-entity extraction. Computat. Intell. 16, 586–595. [.pdf] [CS]
Barabási, A.-L. and Albert, R. 1999 Emergence of scaling in random networks. Science 286, 509–512. [.pdf]
Barabási, A.-L., Albert, R. and Jeong, H. 1999 Mean-field theory for scale-free random networks. Physica A 272, 173–187. [.pdf] [CS]
Barabási, A.-L., Freeh, V. W., Jeong, H. and Brockman, J. B. 2001 Parasitic computing. Nature 412, 894–897. [.pdf]
Barlow, R. and Proschan, F. 1975 Statistical Theory of Reliability and Life Testing. Austin, TX: Holt, Rinehart and Winston.
Barthélémy, M. and Amaral, L. A. N. 1999 Small-world networks: evidence for a crossover picture. Phys. Rev. Lett. 82, 3180–3183. [.pdf]
Bass, F. M. 1969 A new product growth model for consumer durables. Mngmt Sci. 15, 215–227.
Bellman, R. E. 1957 Dynamic Programming. Princeton, NJ: Princeton University Press. [.pdf]
Berger, J. O. 1985 Statistical Decision Theory And Bayesian Analysis. Springer.
Bergman, M. 2000 The Deep Web: Surfacing Hidden Value. J. Electron. Publ. 7. (Available from http://www.completeplanet.com/Tutorials/DeepWeb/.)
Berners-Lee, T. 1994 Universal Resource Identifiers in WWW: A Unifying Syntax for the Expression of Names and Addresses of Objects on the Network as used in the World-Wide Web. RFC 1630. (Available from http://www.ietf.org/rfc/rfc1630.txt.)
Berners-Lee, T., Fielding, R. and Masinter, L. 1998 Uniform Resource Identifiers (URI): Generic Syntax. RFC 2396. (Available from http://www.ietf.org/rfc/rfc2396.txt.)
Berry, M. W. 1992 Large scale singular value computations. J. Supercomput. Applic. 6, 13–49.
Berry, M. W. and Browne, M.
1999 Understanding Search Engines: Mathematical Modeling and Text Retrieval. Philadelphia, PA: Society for Industrial and Applied Mathematics.
Bharat, K. and Broder, A. 1998 A technique for measuring the relative size and overlap of public Web search engines. Proc. 7th Int. World Wide Web Conf., Brisbane, Australia, pp. 379–388. [Pub]
Bharat, K. and Henzinger, M. R. 1998 Improved algorithms for topic distillation in a hyperlinked environment. Proc. 21st Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 104–111. New York: ACM Press. [.pdf] [CS]
Bianchini, M., Gori, M. and Scarselli, F. 2001 Inside Google’s Web page scoring system. Technical report, Dipartimento di Ingegneria dell’Informazione, Università di Siena.
Bikel, D. M., Miller, S., Schwartz, R. and Weischedel, R. 1997 Nymble: a high-performance learning name-finder. In Proceedings of ANLP97, pp. 194–201. (Available from http://citeseer.ist.psu.edu/bikel97nymble.html.)
Billsus, D. and Pazzani, M. 1998 Learning collaborative information filters. Proc. Int. Conf. on Machine Learning, pp. 46–54. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS]
Blahut, R. E. 1987 Principles and Practice of Information Theory. Reading, MA: Addison-Wesley.
Blei, D., Ng, A. Y. and Jordan, M. I. 2002a Hierarchical Bayesian models for applications in information retrieval. In Bayesian Statistics 7 (ed. J. M. Bernardo, M. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West). Oxford University Press. [.ps]
Blei, D., Ng, A. Y. and Jordan, M. I. 2002b Latent Dirichlet allocation. In Advances in Neural Information Processing Systems 14 (ed. T. Dietterich, S. Becker and Z. Ghahramani). San Francisco, CA: Morgan Kaufmann. [.pdf] [CS]
Blum, A. and Mitchell, T. 1998 Combining labeled and unlabeled data with co-training. Proc. 11th Ann. Conf. on Computational Learning Theory (COLT98), pp. 92–100. New York: ACM Press. [.ps] [CS]
Bollacker, K. D., Lawrence, S. and Giles, C. L.
1998 CiteSeer: an autonomous Web agent for automatic retrieval and identification of interesting publications. In Proc. 2nd Int. Conf. on Autonomous Agents (Agents’98) (ed. K. P. Sycara and M. Wooldridge), pp. 116–123. New York: ACM Press. [.ps.gz] [CS]
Bollobás, B. 1985 Random Graphs. London: Academic Press.
Bollobás, B. and de la Vega, W. F. 1982 The diameter of random regular graphs. Combinatorica 2, 125–134.
Bollobás, B. and Riordan, O. 2003 The diameter of a scale-free random graph. Combinatorica. (In the press.)
Bollobás, B., Riordan, O., Spencer, J. and Tusnády, G. 2001 The degree sequence of a scale-free random graph process. Random. Struct. Alg. 18, 279–290.
Borodin, A., Roberts, G. O., Rosenthal, J. S. and Tsaparas, P. 2001 Finding authorities and hubs from link structures on the World Wide Web. Proc. 10th Int. Conf. on World Wide Web, pp. 415–429. [Pub]
Box, G. E. P. and Tiao, G. C. 1992 Bayesian Inference In Statistical Analysis. John Wiley & Sons, Ltd/Inc.
Boyan, J., Freitag, D. and Joachims, T. 1996 A machine learning architecture for optimizing Web search engines. Proc. AAAI Workshop on Internet-Based Information Systems. [.ps.gz] [CS]
Brand, M. 2002 Incremental singular value decomposition of uncertain data with missing values. Proc. European Conf. on Computer Vision (ECCV): Lecture Notes in Computer Science, pp. 707–720. Springer. [.pdf]
Bray, T. 1996 Measuring the Web. In Proc. 5th Int. Conf. on the World Wide Web, 6–10 May 1996, Paris, France. Comp. Networks 28, 993–1005. [Pub]
Breese, J. S., Heckerman, D. and Kadie, C. 1998 Empirical analysis of predictive algorithms for collaborative filtering. Proc. 14th Conf. on Uncertainty in Artificial Intelligence, pp. 43–52. San Francisco, CA: Morgan Kaufmann. [CS]
Brewington, B. and Cybenko, G. 2000 How dynamic is the Web? Proc. 9th Int. World Wide Web Conf. Geneva: International World Wide Web Conference Committee (IW3C2). [Pub] [CS]
Brin, S. and Page, L.
1998 The anatomy of a large-scale hypertextual (Web) search engine. In Proc. 7th Int. World Wide Web Conf. (WWW7). Comp. Networks 30, 107–117. [Pub] [CS]
Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A. and Wiener, J. 2000 Graph structure in the Web. In Proc. 9th Int. World Wide Web Conf. (WWW9). Comp. Networks 33, 309–320. [Pub]
Brown, L. D. 1986 Fundamentals of Statistical Exponential Families. Hayward, CA: Institute of Mathematical Statistics.
Bucklin, R. E. and Sismeiro, C. 2003 A model of Web site browsing behavior estimated on clickstream data. (In the press.) [.pdf]
Buntine, W. 1992 Learning classification trees. Statist. Comp. 2, 63–73. [CS]
Buntine, W. 1996 A guide to the literature on learning probabilistic networks from data. IEEE Trans. Knowl. Data Engng 8, 195–210. [CS]
Byrne, M. D., John, B. E., Wehrle, N. S. and Crow, D. C. 1999 The tangled Web we wove: a taskonomy of WWW use. Proc. CHI’99: Human Factors in Computing Systems, pp. 544–551. New York: ACM Press. [.pdf]
Cadez, I. V., Heckerman, D., Smyth, P., Meek, C. and White, S. 2003 Model-based clustering and visualization of navigation patterns on a Web site. Data Mining Knowl. Discov. (In the press.) [.pdf]
Califf, M. E. and Mooney, R. J. 1998 Relational learning of pattern-match rules for information extraction. Working Notes of AAAI Spring Symp. on Applying Machine Learning to Discourse Processing, pp. 6–11. Menlo Park, CA: AAAI Press. [.pdf] [CS]
Callaway, D. S., Hopcroft, J. E., Kleinberg, J., Newman, M. E. J. and Strogatz, S. H. 2001 Are randomly grown graphs really random? Phys. Rev. E 64, 041902. [.pdf]
Cardie, C. 1997 Empirical methods in information extraction. AI Mag. 18, 65–80.
Carlson, J. M. and Doyle, J. 1999 Highly optimized tolerance: a mechanism for power laws in designed systems. Phys. Rev. E 60, 1412–1427. [.ps] [CS]
Castelli, V. and Cover, T. 1995 On the exponential value of labeled samples. Pattern Recog. Lett. 16, 105–111.
Catledge, L. D.
and Pitkow, J. 1995 Characterizing browsing strategies in the World-Wide Web. Comp. Networks ISDN Syst. 27, 1065–1073. [Pub] [CS]
Chaitin, G. J. 1987 Algorithmic Information Theory. Cambridge University Press.
Chakrabarti, S., Dom, B., Gibson, D., Kleinberg, J., Kumar, S. R., Raghavan, P., Rajagopalan, S. and Tomkins, A. 1999a Mining the link structure of the World Wide Web. IEEE Computer 32, 60–67. [.ps] [CS]
Chakrabarti, S., Joshi, M. M., Punera, K. and Pennock, D. M. 2002 The structure of broad topics on the Web. Proc. 11th Int. Conf. on World Wide Web, pp. 251–262. New York: ACM Press. [.pdf] [CS]
Chakrabarti, S., van den Berg, M. and Dom, B. 1999b Focused crawling: a new approach to topic-specific Web resource discovery. In Proc. 8th Int. World Wide Web Conf., Toronto. Comp. Networks 31, 11–16. [.pdf] [CS]
Charniak, E. 1991 Bayesian networks without tears. AI Mag. 12, 50–63.
Chen, S. F. and Goodman, J. 1996 An empirical study of smoothing techniques for language modeling. In Proc. 34th Ann. Meeting of the Association for Computational Linguistics (ed. A. Joshi and M. Palmer), pp. 310–318. San Francisco, CA: Morgan Kaufmann. [.pdf]
Chickering, D. M., Heckerman, D. and Meek, C. 1997 A Bayesian approach to learning Bayesian networks with local structure. Uncertainty in Artificial Intelligence: Proc. 13th Conf. (UAI1997), pp. 80–89. San Francisco, CA: Morgan Kaufmann. [CS]
Cho, J. and Garcia-Molina, H. 2000a Estimating frequency of change. Technical Report DBPUBS4, Stanford University. (Available via http://dbpubs.stanford.edu/pub/20004.)
Cho, J. and Garcia-Molina, H. 2000b Synchronizing a database to improve freshness. Proc. 2000 ACM Int. Conf. on Management of Data (SIGMOD), pp. 117–128. [.pdf] [CS]
Cho, J. and Garcia-Molina, H. 2002 Parallel crawlers. Proc. 11th World Wide Web Conf. (WWW11), Honolulu, Hawaii. [.pdf] [CS]
Cho, J. and Ntoulas, A. 2002 Effective change detection using sampling. Proc. 28th Int. Conf. on Very Large Databases (VLDB).
[.pdf] [CS]
Cho, J., Garcia-Molina, H. and Page, L. 1998 Efficient crawling through URL ordering. In Proc. 7th Int. World Wide Web Conf. (WWW7). Comp. Networks 30, 161–172. [Pub] [.ps] [CS]
Chung, F. R. K., Graham, R. L. and Wilson, R. M. 1989 Quasi-random graphs. Combinatorica 9, 345–362. [.pdf]
Chung, F. and Graham, R. 2002 Sparse quasi-random graphs. Combinatorica 22, 217–244. [.pdf]
Chung, F. and Lu, L. 2001 The diameter of random sparse graphs. Adv. Appl. Math. 26, 257–279. [.pdf] [CS]
Chung, F. and Lu, L. 2002 Connected components in random graphs with given expected degree sequences. Technical Report, University of California, San Diego. [.pdf]
Chung, F., Garrett, M., Graham, R. and Shallcross, D. 2001 Distance realization problems with applications to Internet tomography. J. Comput. Syst. Sci. 63, 432–448. [.pdf]
Clarke, I., Miller, S. G., Hong, T. W., Sandberg, O. and Wiley, B. 2002 Protecting free expression online with Freenet. IEEE Internet Computing 6, 40–49. [.pdf] [CS]
Clarke, I., Sandberg, O., Wiley, B. and Hong, T. W. 2000 Freenet: a distributed anonymous information storage and retrieval system. In Designing Privacy Enhancing Technologies: International Workshop on Design Issues in Anonymity and Unobservability, LNCS 2009 (ed. H. Federrath), pp. 311–320. Springer. [.pdf] [CS]
Cockburn, A. and McKenzie, B. 2002 What do Web users do? An empirical analysis of Web use. Int. J. Human Computer Studies 54, 903–922. [.pdf] [CS]
Cohen, W. W. 1995 Text categorization and relational learning. In Proc. ICML95, 12th Int. Conf. on Machine Learning (ed. A. Prieditis and S. J. Russell), pp. 124–132. San Francisco, CA: Morgan Kaufmann. [.ps] [CS]
Cohen, W. W. 1996 Learning rules that classify email. In AAAI Spring Symp. on Machine Learning in Information Access (ed. M. Hearst and H. Hirsh). 1996 Spring Symposium Series. Menlo Park, CA: AAAI Press. [.ps] [CS]
Cohen, W. W. and McCallum, A. 2002 Information Extraction from the World Wide Web.
Tutorial presented at 15th Neural Information Processing Conf. (NIPS15). [.ps]
Cohen, W. W., Schapire, R. E. and Singer, Y. 1999 Learning to order things. J. Artif. Intell. Res. 10, 243–270. [Pub] [CS]
Cohen, W. W., McCallum, A. and Quass, D. 2000 Learning to understand the Web. IEEE Data Engng Bull. 23, 17–24. [.pdf] [CS]
Cohn, D. and Chang, H. 2000 Learning to probabilistically identify authoritative documents. Proc. 17th Int. Conf. on Machine Learning, pp. 167–174. San Francisco, CA: Morgan Kaufmann. [.ps.gz] [CS]
Cohn, D. and Hofmann, T. 2001 The missing link: a probabilistic model of document content and hypertext connectivity. In Advances in Neural Information Processing Systems (ed. T. K. Leen, T. G. Dietterich and V. Tresp). Boston, MA: MIT Press. [.pdf] [CS]
Cooley, R., Mobasher, B. and Srivastava, J. 1999 Data preparation for mining World Wide Web browsing patterns. Knowl. Informat. Syst. 1, 5–32. [.pdf] [CS]
Cooper, C. and Frieze, A. 2001 A general model of Web graphs. Technical Report. [.pdf] [CS]
Cooper, G. F. 1990 The computational complexity of probabilistic inference using Bayesian belief networks. Art. Intell. 42, 393–405.
Cooper, W. S. 1991 Some inconsistencies and misnomers in probabilistic information retrieval. Proc. 14th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 57–61. New York: ACM Press. [CS]
Cormen, T. H., Leiserson, C. E., Rivest, R. L. and Stein, C. 2001 Introduction to Algorithms, 2nd edn. Cambridge, MA: MIT Press.
Cortes, C. and Vapnik, V. N. 1995 Support vector networks. Machine Learning 20, 1–25.
Cover, T. M. and Hart, P. E. 1967 Nearest neighbor pattern classification. IEEE Trans. Inform. Theory 13, 21–27. [.ps.gz] [CS]
Cover, T. M. and Thomas, J. A. 1991 Elements of Information Theory. John Wiley & Sons, Ltd/Inc.
Cox, R. T. 1964 Probability, frequency and reasonable expectation. Am. J. Phys. 14, 1–13.
Crammer, K. and Singer, Y.
2000 On the learnability and design of output codes for multiclass problems. Proc. 13th Conf. on Computational Learning Theory, pp. 35–46. [.ps.gz] [CS]
Craven, M., di Pasquo, D., Freitag, D., McCallum, A., Mitchell, T., Nigam, K. and Slattery, S. 2000 Learning to construct knowledge bases from the World Wide Web. Artif. Intel. 118(1–2), 69–113. [.pdf] [CS]
Craven, M. and Slattery, S. 2001 Relational learning with statistical predicate invention: better models for hypertext. Machine Learning 43(1/2), 97–119. [.pdf] [CS]
Cristianini, N. and Shawe-Taylor, J. 2000 An Introduction to Support Vector Machines. Cambridge University Press.
Daley, D. J. and Gani, J. 1999 Epidemic Modeling: an Introduction. Cambridge University Press.
Davison, B. D. 2000a Recognizing nepotistic links on the Web. AAAI Workshop on Artificial Intelligence for Web Search. [.pdf] [CS]
Davison, B. D. 2000b Topical locality in the Web. Proc. 23rd Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 272–279. [.pdf] [CS]
Dawid, A. P. 1992 Applications of a general propagation algorithm for probabilistic expert systems. Stat. Comp. 2, 25–36.
Day, J. 1995 The (un)revised OSI reference model. ACM SIGCOMM Computer Communication Review 25, 39–55. [Pub]
Day, J. and Zimmerman, H. 1983 The OSI reference model. Proc. IEEE 71, 1334–1340.
De Bra, P. and Post, R. 1994 Information retrieval in the World Wide Web: making client-based searching feasible. Proc. 1st Int. World Wide Web Conf. [.ps] [CS]
Dechter, R. 1999 Bucket elimination: A unifying framework for reasoning. Artif. Intel. 113, 41–85. [.ps] [CS]
Deering, S. and Hinden, R. 1998 Internet Protocol, Version 6 (IPv6) Specification. RFC 2460. (Available from http://www.ietf.org/rfc/rfc2460.txt.)
Del Bimbo, A. 1999 Visual Information Retrieval. San Francisco, CA: Morgan Kaufmann.
Dempster, A. P., Laird, N. M. and Rubin, D. B. 1977 Maximum likelihood from incomplete data via the EM algorithm. Journal Royal Statistical Society B39, 1–22.
Deshpande, M. and Karypis, G. 2001 Selective Markov models for predicting Web-page accesses. Proc. SIAM Conf. on Data Mining. SIAM Press. [.pdf] [CS]
Dhillon, I. S., Fan, J. and Guan, Y. 2001 Efficient clustering of very large document collections. In Data Mining for Scientific and Engineering Applications (ed. Grossman, R., Kamath, C. and Naburu, R.). Kluwer Academic. [.ps.gz] [CS]
Dhillon, I. S. and Modha, D. S. 2001 Concept decompositions for large sparse text data using clustering. Machine Learning 42, 143–175. [.ps.gz] [CS]
Dietterich, T. G. and Bakiri, G. 1995 Solving multiclass learning problems via error-correcting output codes. J. Artificial Intelligence Research 2, 263–286. [.pdf] [CS]
Dijkstra, E. W. 1959 A note on two problems in connexion with graphs. Numerische Mathematik 1, 269–271.
Diligenti, M., Coetzee, F., Lawrence, S., Giles, C. L. and Gori, M. 2000 Focused crawling using context graphs. In VLDB 2000, Proc. 26th Int. Conf. on Very Large Data Bases, 10–14 September 2000, Cairo, Egypt (ed. A. El Abbadi, M. L. Brodie, S. Chakravarthy, U. Dayal, N. Kamel, G. Schlageter and K.-Y. Whang), pp. 527–534. Los Altos, CA: Morgan Kaufmann. [.pdf] [CS]
Dill, S., Kumar, S. R., McCurley, K. S., Rajagopalan, S., Sivakumar, D. and Tomkins, A. 2001 Self-similarity in the Web. Proc. VLDB, pp. 69–78. [Pub] [CS]
Domingos, P. and Pazzani, M. 1997 On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning 29, 103–130. [.pdf] [CS]
Domingos, P. and Richardson, M. 2001 Mining the network value of customers. Proc. ACM 7th Int. Conf. on Knowledge Discovery and Data Mining, pp. 57–66. New York: ACM Press. [.pdf] [CS]
Dreilinger, D. and Howe, A. E. 1997 Experiences with selecting search engines using metasearch. ACM Trans. Informat. Syst. 15, 195–222. [.pdf] [CS]
Drucker, H., Vapnik, V. N. and Wu, D. 1999 Support vector machines for spam categorization. IEEE Trans. Neural Networks 10, 1048–1054.
Duda, R. O. and Hart, P. E.
1973 Pattern Classification and Scene Analysis. John Wiley & Sons, Ltd/Inc.
Dumais, S., Platt, J., Heckerman, D. and Sahami, M. 1998 Inductive learning algorithms and representations for text categorization. In Proc. 7th Int. Conf. on Information and Knowledge Management, pp. 148–155. New York: ACM Press. [.pdf]
Jones, K. S. and Willett, P. (eds) 1997 Readings in information retrieval. San Mateo, CA: Morgan Kaufmann.
Edwards, J., McCurley, K. and Tomlin, J. 2001 An adaptive model for optimizing performance of an incremental Web crawler. Proc. 10th Int. World Wide Web Conf., pp. 106–113. [Pub] [CS]
Elias, P. 1975 Universal codeword sets and representations of the integers. IEEE Trans. Inform. Theory 21, 194–203.
Erdős, P. and Rényi, A. 1959 On random graphs. Publ. Math. Debrecen 6, 290–291.
Erdős, P. and Rényi, A. 1960 On the evolution of random graphs. Magy. Tud. Akad. Mat. Kut. Intez. Kozl. 5, 17–61.
Everitt, B. S. 1984 An Introduction to Latent Variable Models. London: Chapman & Hall.
Evgeniou, T., Pontil, M. and Poggio, T. 2000 Regularization networks and support vector machines. Adv. Comput. Math. 13, 1–50. [.ps] [CS]
Fagin, R., Karlin, A., Kleinberg, J., Raghavan, P., Rajagopalan, S., Rubinfeld, R., Sudan, M. and Tomkins, A. 2000 Random walks with ‘back buttons’. Proc. ACM Symp. on Theory of Computing, pp. 484–493. New York: ACM Press. [.ps] [CS]
Faloutsos, C. and Christodoulakis, S. 1984 Signature files: an access method for documents and its analytical performance evaluation. ACM Trans. Informat. Syst. 2, 267–288.
Faloutsos, M., Faloutsos, P. and Faloutsos, C. 1999 On power-law relationships of the Internet topology. Proc. ACM SIGCOMM Conf., Cambridge, MA, pp. 251–262. [.pdf] [CS]
Feller, W. 1971 An Introduction to Probability Theory and its Applications, 2nd edn, vol. 2. John Wiley & Sons, Ltd/Inc.
Fermi, E. 1949 On the origin of the cosmic radiation. Phys. Rev. 75, 1169–1174.
Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P.
and Berners-Lee, T. 1999 Hypertext Transfer Protocol: HTTP/1.1. RFC 2616. (Available from http://www.ietf.org/rfc/
Fienberg, S. E., Johnson, M. A. and Junker, B. J. 1999 Classical multilevel and Bayesian approaches to population size estimation using multiple lists. J. Roy. Statist. Soc. A162, 383–406.
Flake, G. W., Lawrence, S. and Giles, C. L. 2000 Efficient identification of Web communities. Proc. 6th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 150–160. New York: ACM Press. [.pdf] [CS]
Flake, G. W., Lawrence, S., Giles, C. L. and Coetzee, F. 2002 Self-organization and identification of Web communities. IEEE Computer 35, 66–71. [.pdf]
Fox, C. 1992 Lexical analysis and stoplists. In Information Retrieval: Data Structures and Algorithms (ed. W. B. Frakes and R. Baeza-Yates), ch. 7. Englewood Cliffs, NJ: Prentice Hall.
Fraley, C. and Raftery, A. E. 2002 Model-based clustering, discriminant analysis, and density estimation. J. Am. Statist. Assoc. 97, 611–631. [.pdf] [CS]
Freitag, D. 1998 Information extraction from HTML: Application of a general machine learning approach. Proc. AAAI98, pp. 517–523. Menlo Park, CA: AAAI Press. [.ps.gz] [CS]
Freitag, D. and McCallum, A. 2000 Information extraction with HMM structures learned by stochastic optimization. Proc. AAAI/IAAI, pp. 584–589. [.ps.gz] [CS]
Freund, Y. and Schapire, R. E. 1996 Experiments with a new boosting algorithm. In Proc. 13th Int. Conf. on Machine Learning, pp. 148–156. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS]
Frey, B. J. 1998 Graphical Models for Machine Learning and Digital Communication. MIT Press.
Friedman, N. and Goldszmidt, M. 1996 Learning Bayesian networks with local structure. In Proc. 12th Conf. on Uncertainty in Artificial Intelligence, Portland, Oregon (ed. E. Horvitz and F. Jensen), pp. 274–282. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS]
Friedman, N., Getoor, L., Koller, D. and Pfeffer, A. 1999 Learning probabilistic relational models. In Proc. 16th Int. Joint Conf.
on Artificial Intelligence (IJCAI99) (ed. D. Thomas), vol. 2, pp. 1300–1309. San Francisco, CA: Morgan Kaufmann. [.ps] [CS]
Fuhr, N. 1992 Probabilistic models in information retrieval. Comp. J. 35, 243–255. [.ps.gz] [CS]
Galambos, J. 1987 The Asymptotic Theory of Extreme Order Statistics, 2nd edn. Malabar, FL: Robert E. Krieger.
Garfield, E. 1955 Citation indexes for science: a new dimension in documentation through association of ideas. Science 122, 108–111.
Garfield, E. 1972 Citation analysis as a tool in journal evaluation. Science 178, 471–479. [.pdf]
Garner, R. 1967 A Computer Oriented, Graph Theoretic Analysis of Citation Index Structures. Philadelphia, PA: Drexel University Press.
Gelbukh, A. and Sidorov, G. 2001 Zipf and Heaps Laws’ coefficients depend on language. Proc. 2001 Conf. on Intelligent Text Processing and Computational Linguistics, pp. 332–335. Springer. [.html]
Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. 1995 Bayesian Data Analysis. London: Chapman & Hall.
Ghahramani, Z. 1998 Learning dynamic Bayesian networks. In Adaptive Processing of Sequences and Data Structures. Lecture Notes in Artificial Intelligence (ed. M. Gori and C. L. Giles), pp. 168–197. Springer. [.ps.gz] [CS]
Ghahramani, Z. and Jordan, M. I. 1997 Factorial hidden Markov models. Machine Learning 29, 245–273. [.ps.gz] [CS]
Ghani, R. 2000 Using error-correcting codes for text classification. Proc. 17th Int. Conf. on Machine Learning, pp. 303–310. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS]
Gibson, D., Kleinberg, J. and Raghavan, P. 1998 Inferring Web communities from link topology. Proc. 9th ACM Conf. on Hypertext and Hypermedia: Links, Objects, Time and Space structure in Hypermedia Systems, pp. 225–234. New York: ACM Press. [.pdf] [CS]
Gilbert, E. N. 1959 Random graphs. Ann. Math. Statist. 30, 1141–1144.
Gilbert, N. 1997 A simulation of the structure of academic science. Sociological Research Online 2.
(Available from http://www.socresonline.org.uk/socresonline/2/2/3.html)
Gilks, W. R., Thomas, A. and Spiegelhalter, D. J. 1994 A language and program for complex Bayesian modelling. The Statistician 43, 69–78.
Greenberg, S. 1993 The Computer User as Toolsmith: the Use, Reuse, and Organization of Computer-Based Tools. Cambridge University Press.
Guermeur, Y., Elisseeff, A. and Paugam-Moisy, H. 2000 A new multiclass SVM based on a uniform convergence result. In Proc. IJCNN: Int. Joint Conf. on Neural Networks, vol. 4, pp. 4183–4188. Piscataway, NJ: IEEE Press. [.ps] [CS]
Han, E.-H., Karypis, G. and Kumar, V. 2001 Text categorization using weight-adjusted k-nearest neighbor classification. In Proc. PAKDD01, 5th Pacific–Asia Conference on Knowledge Discovery and Data Mining (ed. D. Cheung, Q. Li and G. Williams). Lecture Notes in Computer Science Series, vol. 2035, pp. 53–65. Springer. [.pdf] [CS]
Han, J. and Kamber, M. 2001 Data Mining: Concepts and Techniques. San Francisco, CA: Morgan Kaufmann.
Hand, D., Mannila, H. and Smyth, P. 2001 Principles of Data Mining. Cambridge, MA: MIT Press.
Harman, D., Baeza-Yates, R., Fox, E. and Lee, W. 1992 Inverted files. In Information Retrieval, Data Structures and Algorithms (ed. W. B. Frakes and R. A. Baeza-Yates), pp. 28–43. Englewood Cliffs, NJ: Prentice Hall.
Hastie, T., Tibshirani, R. and Friedman, J. 2001 Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
Heckerman, D. 1998 A tutorial on learning with Bayesian networks. In Learning in Graphical Models (ed. M. Jordan). Kluwer.
Heckerman, D., Chickering, D. M., Meek, C., Rounthwaite, R. and Kadie, C. 2000 Dependency networks for inference, collaborative filtering, and data visualization. J. Mach. Learn. Res. 1, 49–75. [Pub] [CS]
Hersovici, M., Jacovi, M., Maarek, Y. S., Pelleg, D., Shtalhaim, M. and Ur, S. 1998 The shark search algorithm – an application: tailored Web site mapping. In Proc. 7th Int. World Wide Web Conf. Comp. Networks 30, 317–326.
[Pub] [CS]
Heydon, A. and Najork, M. 1999 Mercator: a scalable, extensible Web crawler. Proc. World Wide Web Conf. 2, 219–229. (Available from http://research.compaq.com/SRC/mercator/research.html.)
Heydon, A. and Najork, M. 2001 High-performance Web crawling. Technical Report SRC 173. Compaq Systems Research Center. [CS]
Hofmann, T. 1999 Probabilistic latent semantic indexing. Proc. 22nd Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 50–57. New York: ACM Press. [.pdf] [CS]
Hofmann, T. 2001 Unsupervised learning by probabilistic latent semantic analysis. Machine Learning 42, 177–196. [.pdf]
Hofmann, T. and Puzicha, J. 1999 Latent class models for collaborative filtering. In Proc. 16th Int. Joint Conf. on Artificial Intelligence, pp. 688–693. [.pdf]
Hofmann, T., Puzicha, J. and Jordan, M. I. 1999 Learning from dyadic data. In Advances in Neural Information Processing Systems 11: Proc. 1998 Conf. (ed. M. S. Kearns, S. A. Solla and D. Cohen), pp. 466–472. Cambridge, MA: MIT Press. [Pub]
Huberman, B. A. and Adamic, L. A. 1999 Growth dynamics of the World Wide Web. Nature 401, 131. [.pdf]
Huberman, B. A., Pirolli, P. L. T., Pitkow, J. E. and Lukose, R. M. 1998 Strong regularities in World Wide Web surfing. Science 280, 95–97. [.pdf] [CS]
Hunter, J. and Shotland, R. 1974 Treating data collected by the small-world method as a Markov process. Social Forces 52, 321.
ISO 1986 Information Processing, Text and Office Systems, Standard Generalized Markup Language (SGML), ISO 8879, 1st edn. Geneva, Switzerland: International Organization for Standardization.
Itti, L. and Koch, C. 2001 Computational modelling of visual attention. Nature Rev. Neurosci. 2, 194–203. [.pdf]
Jaakkola, T. S. and Jordan, M. I. 1997 Recursive algorithms for approximating probabilities in graphical models. In Advances in Neural Information Processing Systems (ed. M. C. Mozer, M. I. Jordan and T. Petsche), vol. 9, pp. 487–493. Cambridge, MA: MIT Press. [.ps] [CS]
Jaeger, M.
1997 Relational Bayesian networks. In Proc. 13th Conf. on Uncertainty in Artificial Intelligence (UAI97) (ed. D. Geiger and P. P. Shenoy), pp. 266–273. San Francisco, CA: Morgan Kaufmann. [.ps.gz] [CS] Janiszewski, C. 1998 The influence of display characteristics on visual exploratory behavior. J. Consumer Res. 25, 290–301. Jansen, B. J., Spink, A., Bateman, J. and Saracevic, T. 1998 Real-life information retrieval: a study of user queries on the Web. SIGIR Forum 32, 5–17. [.pdf] Jaynes, E. T. 1986 Bayesian methods: general background. In Maximum Entropy and Bayesian Methods in Statistics (ed. J. H. Justice), pp. 1–25. Cambridge University Press. [.pdf] Jaynes, E. T. 2003 Probability Theory: The Logic of Science. Cambridge University Press. Jensen, F. V. 1996 An Introduction to Bayesian Networks. Springer. Jensen, F. V., Lauritzen, S. L. and Olesen, K. G. 1990 Bayesian updating in causal probabilistic networks by local computations. Comput. Statist. Q. 4, 269–282. Jeong, H., Tombor, B., Albert, R., Oltvai, Z. and Barabási, A.-L. 2000 The large-scale organization of metabolic networks. Nature 407, 651–654. [Pub] [CS] Joachims, T. 1997 A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. In Proc. 14th Int. Conf. on Machine Learning, pp. 143–151. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS] Joachims, T. 1998 Text categorization with support vector machines: learning with many relevant features. In Proc. 10th European Conf. on Machine Learning, pp. 137–142. Springer. [CS] Joachims, T. 1999a Making large-scale SVM learning practical. In Advances in Kernel Methods: Support Vector Learning (ed. B. Schölkopf, C. J. C. Burges and A. J. Smola), pp. 169–184. Cambridge, MA: MIT Press. [.ps.gz] [CS] Joachims, T. 1999b Transductive inference for text classification using support vector machines. Proc. 16th Int. Conf. on Machine Learning (ICML), pp. 200–209. San Francisco, CA: Morgan Kaufmann. [.ps.gz] [CS] Joachims, T.
2002 Learning to Classify Text using Support Vector Machines. Kluwer. Jordan, M. I. (ed.) 1999 Learning in Graphical Models. Cambridge, MA: MIT Press. Jordan, M. I., Ghahramani, Z. and Saul, L. K. 1997 Hidden Markov decision trees. In Advances in Neural Information Processing Systems (ed. M. C. Mozer, M. I. Jordan and T. Petsche), vol. 9, pp. 501–507. Cambridge, MA: MIT Press. [.ps.gz] [CS] Jumarie, G. 1990 Relative information. Springer. Kask, K. and Dechter, R. 1999 Branch and bound with mini-bucket heuristics. Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI99), pp. 426–433. [.pdf] [CS] Kessler, M. 1963 Bibliographic coupling between scientific papers. Am. Documentat. 14, 10–25. Killworth, P. and Bernard, H. 1978 Reverse small world experiment. Social Networks 1, 159. Kira, K. and Rendell, L. A. 1992 A practical approach to feature selection. Proc. 9th Int. Conf. on Machine Learning, pp. 249–256. San Francisco, CA: Morgan Kaufmann. Kittler, J. 1986 Feature selection and extraction. In Handbook of Pattern Recognition and Image Processing (ed. T. Y. Young and K. S. Fu), ch. 3. Academic. Kleinberg, J. 1998 Authoritative sources in a hyperlinked environment. Proc. 9th Ann. ACM–SIAM Symp. on Discrete Algorithms, pp. 668–677. New York: ACM Press. (A preliminary version of this paper appeared as IBM Research Report RJ 10076, May 1997.) [.pdf] [CS] Kleinberg, J. 1999 Hubs, authorities, and communities. ACM Comput. Surv. 31, 5. [Pub] Kleinberg, J. 2000a Navigation in a small world. Nature 406, 845. [.pdf] Kleinberg, J. 2000b The small-world phenomenon: an algorithmic perspective. Proc. 32nd ACM Symp. on the Theory of Computing. [.ps] [CS] Kleinberg, J. 2001 Small-world phenomena and the dynamics of information. Advances in Neural Information Processing Systems (NIPS), vol. 14. Cambridge, MA: MIT Press. [Pub] [CS] Kleinberg, J. and Lawrence, S. 2001 The structure of the Web. Science 294, 1849–1850. [.pdf] Kleinberg, J., Kumar, R., Raghavan, P., Rajagopalan, S.
and Tomkins, A. 1999 The Web as a graph: measurements, models, and methods. Proc. Int. Conf. on Combinatorics and Computing. Lecture Notes in Computer Science, vol. 1627. Springer. [CS] Kohavi, R. and John, G. 1997 Wrappers for feature subset selection. Artif. Intell. 97, 273–324. [CS] Koller, D. and Sahami, M. 1997 Hierarchically classifying documents using very few words. Proc. 14th Int. Conf. on Machine Learning (ICML97), pp. 170–178. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS] Koller, D. and Sahami, M. 1996 Toward optimal feature selection. Proc. 13th Int. Conf. on Machine Learning, pp. 284–292. [.ps] [CS] Korte, C. and Milgram, S. 1978 Acquaintance networks between racial groups: application of the small world method. J. Pers. Social Psych. 15, 101. Koster, M. 1995 Robots in the Web: threat or treat? ConneXions 9(4). [.html] Krishnamurthy, B., Mogul, J. C. and Kristol, D. M. 1999 Key differences between HTTP/1.0 and HTTP/1.1. In Proc. 8th Int. World Wide Web Conf. Elsevier. [.ps.gz] [CS] Kruger, A., Giles, C. L., Coetzee, F., Glover, E. J., Flake, G. W., Lawrence, S. and Omlin, C. W. 2000 DEADLINER: building a new niche search engine. In Proc. 2000 ACM–CIKM International Conf. on Information and Knowledge Management (CIKM00) (ed. A. Agah, J. Callan and E. Rundensteiner), pp. 272–281. New York: ACM Press. [Pub] [CS] Kullback, S. 1968 Information theory and statistics. New York: Dover. Kumar, S. R., Raghavan, P., Rajagopalan, S. and Tomkins, A. 1999a Extracting large-scale knowledge bases from the Web. Proc. 25th VLDB Conf., pp. 639–650. [.pdf] [CS] Kumar, S. R., Raghavan, P., Rajagopalan, S. and Tomkins, A. 1999b Trawling the Web for emerging cyber communities. In Proc. 8th World Wide Web Conf. Comp. Networks 31, 11–16. [.pdf] [CS] Kumar, S. R., Raghavan, P., Rajagopalan, S., Sivakumar, D., Tomkins, A. and Upfal, E. 2000 Stochastic models for the Web graph. Proc. 41st IEEE Ann. Symp. on the Foundations of Computer Science, pp. 57–65.
[.pdf] [CS] Kushmerick, N., Weld, D. S. and Doorenbos, R. B. 1997 Wrapper induction for information extraction. In Proc. Int. Joint Conf. on Artificial Intelligence (IJCAI), pp. 729–737. [.ps.Z] [CS] Lafferty, J., McCallum, A. and Pereira, F. 2001 Conditional random fields: probabilistic models for segmenting and labeling sequence data. Proc. 18th Int. Conf. on Machine Learning, pp. 282–289. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS] Lam, W. and Ho, C. Y. 1998 Using a generalized instance set for automatic text categorization. In Proc. SIGIR98, 21st ACM Int. Conf. on Research and Development in Information Retrieval (ed. W. B. Croft, A. Moffat, C. J. van Rijsbergen, R. Wilkinson and J. Zobel), pp. 81–89. New York: ACM Press. [Pub] Lang, K. 1995 Newsweeder: Learning to filter news. Proc. 12th Int. Conf. on Machine Learning (ed. A. Prieditis and S. J. Russell), pp. 331–339. San Francisco, CA: Morgan Kaufmann. Langley, P. 1994 Selection of relevant features in machine learning. Proc. AAAI Fall Symp. on Relevance, pp. 140–144. [.ps.gz] [CS] Lau, T. and Horvitz, E. 1999 Patterns of search: analyzing and modeling Web query refinement. Proc. 7th Int. Conf. on User Modeling, pp. 119–128. Springer. [.pdf] [CS] Lauritzen, S. L. 1996 Graphical Models. Oxford University Press. Lauritzen, S. L. and Spiegelhalter, D. J. 1988 Local computations with probabilities on graphical structures and their application to expert systems. J. Roy. Statist. Soc. B50, 157–224. Lawrence, S. 2001 Online or invisible? Nature 411, 521. [.pdf] [CS] Lawrence, S. and Giles, C. L. 1998a Context and page analysis for improved Web search. IEEE Internet Computing 2, 38–46. [.pdf] [CS] Lawrence, S. and Giles, C. L. 1998b Searching the World Wide Web. Science 280, 98–100. [.pdf] [CS] Lawrence, S. and Giles, C. L. 1999a Accessibility of information on the Web. Nature 400, 107–109. [Pub] Lawrence, S., Giles, C. L. and Bollacker, K. 1999 Digital libraries and autonomous citation indexing.
IEEE Computer 32, 67–71. [.pdf] [CS] Leek, T. R. 1997 Information extraction using hidden Markov models. Master’s thesis, University of California, San Diego. [.ps.gz] Lempel, R. and Moran, S. 2001 SALSA: the stochastic approach for link-structure analysis. ACM Trans. Informat. Syst. 19, 131–160. [.ps] Letsche, T. A. and Berry, M. W. 1997 Large-scale information retrieval with latent semantic indexing. Information Sciences 100, 105–137. [.html] [CS] Lewis, D. D. 1992 An evaluation of phrasal and clustered representations on a text categorization task. Proc. 15th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 37–50. New York: ACM Press. Lewis, D. D. 1997 Reuters-21578 text categorization test collection. (Documentation and data available at http://www.daviddlewis.com/resources/testcollections/reuters21578/.) Lewis, D. D. 1998 Naive Bayes at forty: the independence assumption in information retrieval. Proc. 10th European Conf. on Machine Learning, pp. 4–15. Springer. [CS] Lewis, D. D. and Catlett, J. 1994 Heterogeneous uncertainty sampling for supervised learning. In Proc. ICML94, 11th Int. Conf. on Machine Learning (ed. W. W. Cohen and H. Hirsh), pp. 148–156. San Francisco, CA: Morgan Kaufmann. [CS] Lewis, D. D. and Gale, W. A. 1994 A sequential algorithm for training text classifiers. Proc. 17th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 3–12. Springer. [CS] Lewis, D. D. and Ringuette, M. 1994 Comparison of two learning algorithms for text categorization. In Proc. 3rd Ann. Symp. on Document Analysis and Information Retrieval, pp. 81–93. [CS] Li, S., Montgomery, A., Srinivasan, K. and Liechty, J. L. 2002 Predicting online purchase conversion using Web path analysis. Graduate School of Industrial Administration, Carnegie Mellon University, Pittsburgh, PA. (Available from http://www.andrew.cmu.edu/~alm3/papers/purchase%20conversion.pdf.) Li, W.
1992 Random texts exhibit Zipf’s-law-like word frequency distribution. IEEE Trans. Inform. Theory 38, 1842–1845. [.pdf] [CS] Lieberman, H. 1995 Letizia: An agent that assists Web browsing. In Proc. 14th Int. Joint Conf. on Artificial Intelligence (IJCAI95) (ed. C. S. Mellish), pp. 924–929. San Mateo, CA: Morgan Kaufmann. [.ps] [CS] Little, R. J. A. and Rubin, D. B. 1987 Statistical Analysis with Missing Data. John Wiley & Sons, Ltd/Inc. Liu, H. and Motoda, H. 1998 Feature Selection for Knowledge Discovery and Data Mining. Kluwer Academic. Lovins, J. B. 1968 Development of a stemming algorithm. Mech. Transl. Comput. Linguistics 11, 22–31. [An unofficial webpage about Lovins' stemmer is here] McCallum, A. and Nigam, K. 1998 A comparison of event models for naive Bayes text classification. AAAI/ICML98 Workshop on Learning for Text Categorization, pp. 41–48. Menlo Park, CA: AAAI Press. [.pdf] [CS] McCallum, A., Freitag, D. and Pereira, F. 2000a Maximum entropy Markov models for information extraction and segmentation. Proc. 17th Int. Conf. on Machine Learning, pp. 591–598. San Francisco, CA: Morgan Kaufmann. [.ps.gz] [CS] McCallum, A., Nigam, K. and Ungar, L. H. 2000b Efficient clustering of high-dimensional data sets with application to reference matching. Proc. 6th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, pp. 169–178. New York: ACM Press. [.ps.gz] [CS] McCallum, A. K., Nigam, K., Rennie, J. and Seymore, K. 2000c Automating the construction of Internet portals with machine learning. Information Retrieval 3, 127–163. [.ps.gz] [CS] McCann, K., Hastings, A. and Huxel, G. R. 1998 Weak trophic interactions and the balance of nature. Nature 395, 794–798. McClelland, J. L. and Rumelhart, D. E. 1986 Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA: MIT Press. McEliece, R. J. 1977 The Theory of Information and Coding. Reading, MA: Addison-Wesley. McEliece, R. J. and Yildirim, M.
2002 Belief propagation on partially ordered sets. In Mathematical Systems Theory in Biology, Communications, and Finance (ed. D. Gilliam and J. Rosenthal). Institute for Mathematics and its Applications, University of Minnesota. [.pdf] McEliece, R. J., MacKay, D. J. C. and Cheng, J. F. 1997 Turbo decoding as an instance of Pearl’s ‘belief propagation’ algorithm. IEEE J. Select. Areas Commun. 16, 140–152. [.pdf] MacKay, D. J. C. and Peto, L. C. B. 1995a A hierarchical Dirichlet language model. Natural Language Engng 1, 1–19. [.ps.gz] [CS] McLachlan, G. and Peel, D. 2000 Finite Mixture Models. John Wiley & Sons, Ltd/Inc. Mahmoud, H. M. and Smythe, R. T. 1995 A survey of recursive trees. Theory Prob. Math. Statist. 51, 1–27. Manber, U. and Myers, G. 1990 Suffix arrays: a new method for online string searches. Proc. 1st Ann. ACM–SIAM Symp. on Discrete Algorithms, pp. 319–327. Philadelphia, PA: Society for Industrial and Applied Mathematics. [.pdf] Mandelbrot, B. 1977 Fractals: Form, Chance, and Dimension. New York: Freeman. Marchiori, M. 1997 The quest for correct information on the Web: hyper search engines. In Proc. 6th Int. World Wide Web Conf., Santa Clara, CA. Comp. Networks 29, 1225–1235. [.html] Mark, E. F. 1988 Searching for information in a hypertext medical handbook. Commun. ACM 31, 880–886. Maron, M. E. 1961 Automatic indexing: an experimental inquiry. J. ACM 8, 404–417. Maslov, S. and Sneppen, K. 2002 Specificity and stability in topology of protein networks. Science 296, 910–913. Melnik, S., Raghavan, S., Yang, B. and Garcia-Molina, H. 2001 Building a distributed full-text index for the Web. ACM Trans. Informat. Syst. 19, 217–241. [.pdf] Mena, J. 1999 Data Mining your Website. Boston, MA: Digital Press. Menczer, F. 1997 ARACHNID: adaptive retrieval agents choosing heuristic neighborhoods for information discovery. Proc. 14th Int. Conf. on Machine Learning, pp. 227–235. San Francisco, CA: Morgan Kaufmann. [.ps] [CS] Menczer, F. and Belew, R. K.
2000 Adaptive retrieval agents: internalizing local context and scaling up to the Web. Machine Learning 39, 203–242. [.pdf] [CS] Milgram, S. 1967 The small world problem. Psychology Today 1, 61. Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D. and Alon, U. 2002 Network motifs: simple building blocks of complex networks. Science 298, 824–827. [.pdf] Mitchell, T. 1997 Machine Learning. McGraw-Hill. Mitzenmacher, M. 2002 A brief history of generative models for power law and lognormal distributions. Technical Report, Harvard University, Cambridge, MA. [.pdf] [CS] Moffat, A. and Zobel, J. 1996 Self-indexing inverted files for fast text retrieval. ACM Trans. Informat. Syst. 14, 349–379. [Pub] [CS] Montgomery, A. L. 2001 Applying quantitative marketing techniques to the Internet. Interfaces 30, 90–108. [.pdf] Mooney, R. J. and Roy, L. 2000 Content-based book recommending using learning for text categorization. Proc. 5th ACM Conf. on Digital Libraries, pp. 195–204. New York: ACM Press. [.pdf] [CS] Mori, S., Suen, C. and Yamamoto, K. 1992 Historical review of OCR research and development. Proc. IEEE 80, 1029–1058. Moura, E. S., Navarro, G. and Ziviani, N. 1997 Indexing compressed text. In Proc. 4th South American Workshop on String Processing (ed. R. Baeza-Yates), International Informatics Series, pp. 95–111. Ottawa: Carleton University Press. [.ps] [CS] Najork, M. and Wiener, J. 2001 Breadth-first search crawling yields high-quality pages. Proc. 10th Int. World Wide Web Conf., pp. 114–118. Elsevier. [Pub] Neal, R. M. 1992 Connectionist learning of belief networks. Artif. Intell. 56, 71–113. Nevill-Manning, C. and Reed, T. 1996 A PostScript to plain text converter. Technical report. (Available from http://www.nzdl.org/html/prescript.html.) Newman, M. E. J., Moore, C. and Watts, D. J. 2000 Mean-field solution of the small-world network model. Phys. Rev. Lett. 84, 3201–3204. [.pdf] [CS] Ng, A. Y. and Jordan, M. I.
2002 On discriminative vs generative classifiers: a comparison of logistic regression and naive Bayes. Advances in Neural Information Processing Systems 14. Proc. 2001 Neural Information Processing Systems (NIPS) Conference. MIT Press. [.pdf] Ng, A. Y., Zheng, A. X. and Jordan, M. I. 2001 Stable algorithms for link analysis. Proc. 24th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 258–266. New York: ACM Press. [.pdf] [CS] Nigam, K. and Ghani, R. 2000 Analyzing the effectiveness and applicability of co-training. In Proc. 2000 ACM–CIKM Int. Conf. on Information and Knowledge Management (CIKM00) (ed. A. Agah, J. Callan and E. Rundensteiner), pp. 86–93. New York: ACM Press. [.pdf] [CS] Nigam, K., McCallum, A., Thrun, S. and Mitchell, T. 2000 Text classification from labeled and unlabeled documents using EM. Machine Learning 39, 103–134. [.pdf] [CS] Nothdurft, H. 2000 Salience from feature contrast: additivity across dimensions. Vision Res. 40, 1183–1201. Olshausen, B. A., Anderson, C. H. and Essen, D. C. V. 1993 A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. J. Neurosci. 13, 4700–4719. Oltvai, Z. N. and Barabási, A.-L. 2002 Life’s complexity pyramid. Science 298, 763–764. [.pdf] O’Neill, E. T., McClain, P. D. and Lavoie, B. F. 1997 A methodology for sampling the World Wide Web. Annual Review of OCLC Research. (Available from http://www.oclc.org/research/publications/arr/1997/oneill/o'neillar980213.htm.) Page, L., Brin, S., Motwani, R. and Winograd, T. 1998 The PageRank citation ranking: bringing order to the Web. Technical report, Stanford University. (Available at http://wwwdb.stanford.edu/xbackrub/pageranksub.ps.) Paine, R. T. 1992 Food-web analysis through field measurements of per capita interaction strength. Nature 355, 73–75. Pandurangan, G., Raghavan, P. and Upfal, E. 2002 Using PageRank to characterize Web structure. Proc. 8th Ann. Int.
Computing and Combinatorics Conf. (COCOON). Lecture Notes in Computer Science, vol. 2387, p. 330. Springer. [.ps.gz] [CS] Papineni, K. 2001 Why inverse document frequency? Proc. North American Association for Computational Linguistics, pp. 25–32. [Pub] [CS] Passerini, A., Pontil, M. and Frasconi, P. 2002 From margins to probabilities in multiclass learning problems. In Proc. 15th European Conf. on Artificial Intelligence (ed. F. van Harmelen). Frontiers in Artificial Intelligence and Applications Series. Amsterdam: IOS Press. [.pdf] [CS] Pazzani, M. 1996 Searching for dependencies in Bayesian classifiers. In Proc. 5th Int. Workshop on Artificial Intelligence and Statistics, pp. 239–248. Springer. [CS] Pearl, J. 1988 Probabilistic reasoning in intelligent systems. San Mateo, CA: Morgan Kaufmann. Pennock, D. M., Flake, G. W., Lawrence, S., Glover, E. J. and Giles, C. L. 2002 Winners don’t take all: characterizing the competition for links on the Web. Proc. Natl Acad. Sci. 99, 5207–5211. [.ps] [CS] Perline, R. 1996 Zipf’s law, the central limit theorem, and the random division of the unit interval. Phys. Rev. E 54, 220–223. Pew Internet Project Report 2002 Search engines. (Available at http://www.pewinternet.org/reports/toc.asp?Report=64.) [.pdf] Phadke, A. G. and Thorp, J. S. 1988 Computer Relaying for Power Systems. John Wiley & Sons, Ltd/Inc. Philips, T. K., Towsley, D. F. and Wolf, J. K. 1990 On the diameter of a class of random graphs. IEEE Trans. Inform. Theory 36, 285–288. Pimm, S. L., Lawton, J. H. and Cohen, J. E. 1991 Food web patterns and their consequences. Nature 350, 669–674. Pittel, B. 1994 Note on the heights of random recursive trees and random m-ary search trees. Random Struct. Algorithms 5, 337–347. Platt, J. 1999 Fast training of support vector machines using sequential minimal optimization. In Advances in Kernel Methods – Support Vector Learning (ed. B. Schölkopf, C. J. C. Burges and A. J. Smola), pp. 185–208. Cambridge, MA: MIT Press.
Popescul, A., Ungar, L. H., Pennock, D. M. and Lawrence, S. 2001 Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments. Proc. 17th Int. Conf. on Uncertainty in Artificial Intelligence, pp. 437–444. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS] Porter, M. 1980 An algorithm for suffix stripping. Program 14, 130–137. Quinlan, J. R. 1986 Induction of decision trees. Machine Learning 1, 81–106. Quinlan, J. R. 1990 Learning logical definitions from relations. Machine Learning 5, 239–266. Rafiei, D. and Mendelzon, A. 2000 What is this page known for? Computing Web page reputations. Proc. 9th World Wide Web Conf. [Pub] Raggett, D., Hors, A. L. and Jacobs, I. (eds) 1999 HTML 4.01 Specification. W3 Consortium Recommendation. (Available from http://www.w3.org/TR/html4/.) Raskinis, I. M. G. and Ganascia, J. 1996 Text categorization: a symbolic approach. Proc. 5th Ann. Symp. on Document Analysis and Information Retrieval. New York: ACM Press. Redner, S. 1998 How popular is your paper? An empirical study of the citation distribution. Euro. Phys. J. B4, 131–134. [.ps] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P. and Riedl, J. 1994 GroupLens: an open architecture for collaborative filtering of netnews. Proc. 9th ACM Conf. on Computer Supported Cooperative Work, pp. 175–186. New York: ACM Press. [.pdf] [CS] Ripeanu, M., Foster, I. and Iamnitchi, A. 2002 Mapping the Gnutella network: properties of large-scale peer-to-peer systems and implications for system design. IEEE Internet Comput. J. 6, 99–100. [.pdf] [CS] Roberts, M. J. and Mahesh, S. M. 1999 Hotmail. Technical report, Harvard University, Cambridge, MA. Case 899185, Harvard Business School Publishing. Robertson, S. E. 1977 The probability ranking principle in IR. J. Documentation 33, 294–304. (Also reprinted in Jones and Willett (1997), pp. 281–286.) Robertson, S. E. and Spärck Jones, K. 1976 Relevance weighting of search terms. J. Am. Soc. Informat. Sci. 27, 129–146.
Robertson, S. E. and Walker, S. 1994 Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. Proc. 17th Ann. Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 232–241. Springer. Rosenblatt, F. 1958 The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408. Ross, S. M. 2002 Probability Models for Computer Science. San Diego, CA: Academic Press. Russell, S. and Norvig, P. 1995 Artificial Intelligence: A Modern Approach. Prentice Hall. Sahami, M., Dumais, S., Heckerman, D. and Horvitz, E. 1998 A Bayesian approach to filtering junk email. AAAI98 Workshop on Learning for Text Categorization, pp. 55–62. [.ps] [CS] Salton, G. 1971 The SMART Retrieval System: Experiments in Automatic Document Processing. Englewood Cliffs, NJ: Prentice Hall. Salton, G. and McGill, M. J. 1983 Introduction to modern information retrieval. McGraw-Hill. Salton, G., Fox, E. A. and Wu, H. 1983 Extended boolean information retrieval. Commun. ACM 26, 1022–1036. Sarukkai, R. R. 2000 Link prediction and path analysis using Markov chains. Comp. Networks 33, 377–386. [Pub] Sarwar, B. M., Karypis, G., Konstan, J. A. and Riedl, J. T. 2000 Analysis of recommender algorithms for e-commerce. Proc. 2nd ACM Conf. on Electronic Commerce, pp. 158–167. New York: ACM Press. Saul, L. and Pereira, F. 1997 Aggregate and mixed-order Markov models for statistical language processing. In Proc. 2nd Conf. on Empirical Methods in Natural Language Processing (ed. C. Cardie and R. Weischedel), pp. 81–89. Somerset, NJ: Association for Computational Linguistics. [.pdf] [CS] Saul, L. K. and Jordan, M. I. 1996 Exploiting tractable substructures in intractable networks. In Advances in Neural Information Processing Systems (ed. D. S. Touretzky, M. C. Mozer and M. E. Hasselmo), vol. 8, pp. 486–492. Cambridge, MA: MIT Press. [.ps.gz] Savage, L. J. 1972 The foundations of statistics. New York: Dover. Schafer, J.
B., Konstan, J. A. and Riedl, J. 2001 E-commerce recommendation applications. J. Data Mining Knowl. Discovery 5, 115–153. [CS] Schapire, R. E. and Freund, Y. 2000 Boostexter: a boosting-based system for text categorization. Machine Learning 39, 135–168. [.pdf] [CS] Schoelkopf, B. and Smola, A. 2002 Learning with Kernels. Cambridge, MA: MIT Press. Sebastiani, F. 2002 Machine learning in automated text categorization. ACM Comput. Surv. 34, 1–47. [.pdf] [CS] Sen, R. and Hansen, M. H. 2003 Predicting a Web user’s next request based on log data. J. Computat. Graph. Stat. (In the press.) [.pdf] Seneta, E. 1981 Non-negative Matrices and Markov Chains. Springer. Shachter, R. D. 1988 Probabilistic inference and influence diagrams. Oper. Res. 36, 589–604. Shachter, R. D., Anderson, S. K. and Szolovits, P. 1994 Global conditioning for probabilistic inference in belief networks. Proc. Conf. on Uncertainty in AI, pp. 514–522. San Francisco, CA: Morgan Kaufmann. [.pdf] [CS] Shahabi, C., Banaei-Kashani, F. and Faruque, J. 2001 A framework for efficient and anonymous Web usage mining based on client-side tracking. In Proceedings of WEBKDD 2001. Lecture Notes in Artificial Intelligence, vol. 2356, pp. 113–144. Springer. [.pdf] Shannon, C. E. 1948a A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423. Shannon, C. E. 1948b A mathematical theory of communication. Bell Syst. Tech. J. 27, 623–656. Shardanand, U. and Maes, P. 1995 Social information filtering: algorithms for automating ‘word of mouth’. Proc. Conf. on Human Factors in Computing Systems, pp. 210–217. [.pdf] [CS] Shore, J. E. and Johnson, R. W. 1980 Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Inform. Theory 26, 26–37. Silverstein, C., Henzinger, M., Marais, H. and Moricz, M. 1998 Analysis of a very large AltaVista query log. Technical Note 199814, Digital System Research Center, Palo Alto, CA. [.ps.gz] [CS] Slonim, N. and Tishby, N.
2000 Document clustering using word clusters via the information bottleneck method. Proc. 23rd Int. Conf. on Research and Development in Information Retrieval, pp. 208–215. New York: ACM Press. [.ps.gz] [CS] Slonim, N., Friedman, N. and Tishby, N. 2002 Unsupervised document classification using sequential information maximization. Proc. 25th Int. Conf. on Research and Development in Information Retrieval, pp. 208–215. New York: ACM Press. [.ps.gz] [CS] Small, H. 1973 Co-citation in the scientific literature: A new measure of the relationship between two documents. J. Am. Soc. Inf. Sci. 24, 265–269. [.pdf] Smyth, P., Heckerman, D. and Jordan, M. I. 1997 Probabilistic independence networks for hidden Markov probability models. Neural Comp. 9, 227–267. [.pdf] [CS] Soderland, S. 1999 Learning information extraction rules for semi-structured and free text. Machine Learning 34, 233–272. [.ps] [CS] Sperberg-McQueen, C. and Burnard, L. (eds) 2002 TEI P4: Guidelines for Electronic Text Encoding and Interchange. Text Encoding Initiative Consortium. (Available from http://www.teic.org/.) Spink, A., Jansen, B. J., Wolfram, D. and Saracevic, T. 2002 From e-sex to e-commerce: Web search changes. IEEE Computer 35, 107–109. [.pdf] Sutton, R. S. and Barto, A. G. 1998 Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press. Tan, P. and Kumar, V. 2002 Discovery of Web robot sessions based on their navigational patterns. Data Mining Knowl. Discov. 6, 9–35. [.ps.gz] [CS] Tantrum, J., Murua, A. and Stuetzle, W. 2002 Hierarchical model-based clustering of large datasets through fractionation and refractionation. Proc. 8th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining. New York: ACM Press. [.pdf] Taskar, B., Abbeel, P. and Koller, D. 2002 Discriminative probabilistic models for relational data. Proc. 18th Conf. on Uncertainty in Artificial Intelligence. San Francisco, CA: Morgan Kaufmann. [.ps] Tauscher, L. and Greenberg, S.
1997 Revisitation patterns in World Wide Web navigation. Proc. Conf. on Human Factors in Computing Systems CHI’97, pp. 97–137. New York: ACM Press. [.pdf] [CS] Tedeschi, B. 2000 Easier-to-use sites would help e-tailers close more sales. New York Times, 12 June 2000. Tishby, N., Pereira, F. and Bialek, W. 1999 The information bottleneck method. In Proc. 37th Ann. Allerton Conf. on Communication, Control, and Computing (ed. B. Hajek and R. S. Sreenivas), pp. 368–377. [.pdf] [CS] Titterington, D. M., Smith, A. F. M. and Makov, U. E. 1985 Statistical Analysis of Finite Mixture Distributions. John Wiley & Sons, Ltd/Inc. Travers, J. and Milgram, S. 1969 An experimental study of the small world problem. Sociometry 32, 425. Ungar, L. H. and Foster, D. P. 1998 Clustering methods for collaborative filtering. In Proc. Workshop on Recommendation Systems at the 15th National Conf. on Artificial Intelligence. Menlo Park, CA: AAAI Press. [.ps] [CS] Vapnik, V. N. 1982 Estimation of Dependences Based on Empirical Data. Springer. Vapnik, V. N. 1995 The Nature of Statistical Learning Theory. Springer. Vapnik, V. N. 1998 Statistical Learning Theory. John Wiley & Sons, Ltd/Inc. Viterbi, A. J. 1967 Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans. Inform. Theory 13, 260–269. Walker, J. 2002 Links and power: the political economy of linking on the Web. Proc. 13th Conf. on Hypertext and Hypermedia, pp. 72–73. New York: ACM Press. [.html] [CS] Wall, L., Christiansen, T. and Schwartz, R. L. 1996 Programming Perl, 2nd edn. Cambridge, MA: O’Reilly & Associates. Wasserman, S. and Faust, K. 1994 Social Network Analysis. Cambridge University Press. Watts, D. J. and Strogatz, S. H. 1998 Collective dynamics of ‘small-world’ networks. Nature 393, 440–442. [.pdf] Watts, D. J., Dodds, P. S. and Newman, M. E. J. 2002 Identity and search in social networks. Science 296, 1302–1305. [.pdf] Weiss, S. M., Apte, C., Damerau, F. J., Johnson, D. E., Oles, F.
J., Goetz, T. and Hampp, T. 1999 Maximizing text-mining performance. IEEE Intell. Syst. 14, 63–69. [.pdf] Weiss, Y. 2000 Correctness of local probability propagation in graphical models with loops. Neural Comp. 12, 1–41. [.pdf] White, H. 1970 Search parameters for the small world problem. Social Forces 49, 259. Whittaker, J. 1990 Graphical Models in Applied Multivariate Statistics. John Wiley & Sons, Ltd/Inc. Wiener, E. D., Pedersen, J. O. and Weigend, A. S. 1995 A neural network approach to topic spotting. Proc. SDAIR95, 4th Ann. Symp. on Document Analysis and Information Retrieval, Las Vegas, NV, pp. 317–332. [CS] Witten, I. H., Moffat, A. and Bell, T. C. 1999 Managing Gigabytes: Compressing and Indexing Documents and Images, 2nd edn. San Francisco, CA: Morgan Kaufmann. Witten, I. H., Nevill-Manning, C. and Cunningham, S. J. 1996 Building a digital library for computer science research: technical issues. Proc. Australasian Computer Science Conf., Melbourne, Australia. Wolf, J., Squillante, M., Yu, P., Sethuraman, J. and Ozsen, L. 2002 Optimal crawling strategies for Web search engines. Proc. 11th Int. World Wide Web Conf., pp. 136–147. [Pub] Xie, Y. and O’Hallaron, D. 2002 Locality in search engine queries and its implications for caching. Proc. IEEE Infocom 2002, pp. 1238–1247. Piscataway, NJ: IEEE Press. [.pdf] [CS] Yang, Y. 1999 An evaluation of statistical approaches to text categorization. Information Retrieval 1, 69–90. [.ps.gz] [CS] Yang, Y. and Liu, X. 1999 A re-examination of text categorization methods. In Proc. SIGIR99, 22nd ACM Int. Conf. on Research and Development in Information Retrieval (ed. M. A. Hearst, F. Gey and R. Tong), pp. 42–49. New York: ACM Press. [.ps.gz] [CS] Yedidia, J., Freeman, W. T. and Weiss, Y. 2000 Generalized belief propagation. Neural Comp. 12, 1–41. [.pdf] York, J. 1992 Use of the Gibbs sampler in expert systems. Artif. Intell. 56, 115–130. Zamir, O. and Etzioni, O. 1998 Web document clustering: a feasibility demonstration.
Proc. 21st Int. Conf. on Research and Development in Information Retrieval (SIGIR), pp. 46–54. New York: ACM Press. [.pdf] [CS] Zelikovitz, S. and Hirsh, H. 2001 Using LSI for text classification in the presence of background text. Proc. 10th Int. ACM Conf. on Information and Knowledge Management, pp. 113–118. New York: ACM Press. [.pdf] [CS] Zhang, T. and Iyengar, V. S. 2002 Recommender systems using linear classifiers. J. Machine Learn. Res. 2, 313–334. [Pub] Zhang, T. and Oles, F. J. 2000 A probability analysis on the value of unlabeled data for classification problems. Proc. 17th Int. Conf. on Machine Learning, Stanford, CA, pp. 1191–1198. [.ps] Zhu, X., Yu, J. and Doyle, J. 2001 Heavy tails, generalized coding, and optimal Web layout. Proc. 2001 IEEE INFOCOM Conf., vol. 3, pp. 1617–1626. Piscataway, NJ: IEEE Press. [.ps] [CS] Zukerman, I., Albrecht, D. W. and Nicholson, A. E. 1999 Predicting users' requests on the WWW. Proc. UM99: 7th Int. Conf. on User Modeling, pp. 275–284. Springer. [.pdf]

Compiled by Paolo Frasconi. Feel free to contact me for corrections/additions.