Graphical Interfaces to Support Information Search

Classics of Information Retrieval Visualization


Scatter/Gather

Hearst, M.A., & Pedersen, J.O. (1996) Reexamining the Cluster Hypothesis: Scatter/Gather on Retrieval Results. In Frei, H.P. et al., (Eds.) Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '96) (pp 76-84).


.pdf available from ACM-SIGIR'96

Author's Abstract
We present Scatter/Gather, a cluster-based document browsing method, as an alternative to ranked titles for the organization and viewing of retrieval results. We systematically evaluate Scatter/Gather in this context and find significant improvements over similarity search ranking alone. This result provides evidence validating the cluster hypothesis which states that relevant documents tend to be more similar to each other than non-relevant documents. We demonstrate that users are able to use this system close to its full potential.

Additional Comments
Scatter/Gather is a cluster-based document browsing method, an alternative to ranked titles for the organization and viewing of retrieval results. Clustering in Scatter/Gather is dynamic: clusters are created from only the 250 top-scoring documents for a query. The algorithm divides the retrieved documents into 5 clusters and lists, at the top of each cluster, the keywords common to its articles. By examining these identifying keywords for each cluster, and considering the number of documents in each cluster, the user can determine which of the document clusters best fits their information need. The user study and experimental study both supported clustering as a useful and usable tool. Even though this is not a graphical interface, the underlying framework is important to graphical interfaces, especially since concepts like "scatter," "gather" and "clustering" are all visual metaphors for text-based presentations.
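The grouping-and-labelling step described above can be illustrated with a minimal sketch. It is not the clustering algorithm used in the paper: it swaps in a plain k-means-style loop over word-count vectors with cosine similarity, seeds the clusters with the first k documents, and uses a tiny stopword list. The names `vectorize`, `cosine` and `scatter` are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a tiny stopword list, just enough for the illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "is", "on"}

def vectorize(text):
    """Turn a document into a bag-of-words Counter, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def scatter(docs, k=5, rounds=10):
    """Group docs into at most k clusters; return a list of (keywords, member indices)."""
    vectors = [vectorize(d) for d in docs]
    # Assumption: seed the centroids with the first k documents (deterministic).
    centroids = [Counter(v) for v in vectors[:k]]
    assignment = [0] * len(docs)
    for _ in range(rounds):
        # Assign every document to its most similar centroid.
        for i, vec in enumerate(vectors):
            scores = [cosine(vec, c) for c in centroids]
            assignment[i] = scores.index(max(scores))
        # Recompute each centroid as the summed counts of its members.
        for c in range(len(centroids)):
            merged = Counter()
            for vec, a in zip(vectors, assignment):
                if a == c:
                    merged.update(vec)
            if merged:
                centroids[c] = merged
    clusters = []
    for c, centroid in enumerate(centroids):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            # The most frequent words in the cluster serve as its label.
            keywords = [term for term, _ in centroid.most_common(3)]
            clusters.append((keywords, members))
    return clusters
```

Calling `scatter(results, k=5)` on the text of the top-scoring documents would give five keyword-labelled groups; "gathering" would then amount to calling it again on the members of the clusters the user selects.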


TileBars

TileBars: Visualization of Term Distribution Information in Full Text Information Access

http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/mah_bdy.htm

Marti A. Hearst, Xerox Palo Alto Research Center

Author's Abstract
The field of information retrieval has traditionally focused on textbases consisting of titles and abstracts. As a consequence, many underlying assumptions must be altered for retrieval from full-length text collections. This paper argues for making use of text structure when retrieving from full text documents, and presents a visualization paradigm, called TileBars, that demonstrates the usefulness of explicit term distribution information in Boolean-type queries. TileBars simultaneously and compactly indicate relative document length, query term frequency, and query term distribution. The patterns in a column of TileBars can be quickly scanned and deciphered, aiding users in making judgments about the potential relevance of the retrieved documents.

Additional Comments
TileBars is frequently cited in the literature on information visualization systems. The traditional ranked list of results is supplemented by a graphical representation of how each article corresponds to the search query.
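The kind of information a TileBar encodes (document length, query term frequency and query term distribution) can be sketched roughly as follows. This is only an illustration: equal-sized word windows stand in for the text segments, characters stand in for grey levels, and the names `tilebar` and `render` are hypothetical.

```python
import re

def tilebar(text, term_sets, num_tiles=8):
    """Return one row of hit counts per term set, one column per text segment."""
    words = re.findall(r"[a-z]+", text.lower())
    # Assumption: equal-sized word windows stand in for topical segments.
    size = max(1, -(-len(words) // num_tiles))  # ceiling division
    segments = [words[i:i + size] for i in range(0, len(words), size)]
    return [[sum(w in terms for w in seg) for seg in segments]
            for terms in term_sets]

def render(rows, shades=" .:#"):
    """Map hit counts to darker characters, as a stand-in for grey levels."""
    return ["".join(shades[min(count, len(shades) - 1)] for count in row)
            for row in rows]
```

Each row corresponds to one set of query terms and each column to one segment of the document, so a column that is dark in every row marks a passage where all the query topics co-occur.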


Tkinq

Querying, Navigating and Visualizing a Digital Library Catalog


http://www.csdl.tamu.edu/DL95/papers/veerasamy/veerasamy.html

Aravindan Veerasamy, Shamkant Navathe, College of Computing, Georgia Institute of Technology

Author's Abstract
We describe the design of an User Interface for a ranked output Information Retrieval system that integrates querying, navigation and visualization in a seamless fashion. Highlights of the system include the following:

By providing a rich set of features, the interface coherently supports a wide spectrum of information gathering tactics for different classes of users.

Additional Comments
This system is an earlier version of the more widely cited system by Veerasamy. The name "Tkinq" is rarely cited, although Veerasamy's name appears in many bibliographies and the system is quite original and potentially useful as a visualization tool.


Tkinq

Veerasamy, A. & Belkin, N.J. (1996). Evaluation of a Tool for Visualization of Information Retrieval Results. In Frei, H.P. et al., (Eds.) Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '96) (pp 85-92). New York : ACM.

.pdf available from ACM-SIGIR '96

Aravindan Veerasamy, College of Computing, Georgia Institute of Technology
Nicholas J. Belkin, School of Communication, Information & Library Studies, Rutgers University

Author's Abstract
We report on the design and evaluation of a visualization tool for Information Retrieval (IR) systems that aims to help the end user in the following respects:

Two different experiments using TREC-4 data were conducted to evaluate the effectiveness of this tool. Results, while mixed, indicate that visualization of this sort may provide useful support for judging the relevance of documents, in particular by enabling users to make more accurate decisions about which documents to inspect in detail. Problems in evaluation of such tools in interactive environments are discussed.

Additional Comments
This concise yet information-dense graphical display of relevance results from queries is novel and effective. "The presence or absence of specific significant words in any document can be quickly seen, and it is possible to identify sequences of documents which do, or do not have important contributions from specific query words." This interface differs from the much-cited TileBars system because the graphical results display the characteristics of many whole documents simultaneously, rather than focusing (as TileBars does) on characteristics of individual documents in a set.
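A rough sketch of such a set-at-a-time display is a grid with one row per query word and one column per retrieved document, in rank order. The version below is an illustration only: raw word counts stand in for whatever term weights the real system uses, digits stand in for bar heights, and the names `word_document_grid` and `render_grid` are hypothetical.

```python
import re

def word_document_grid(query_words, ranked_docs):
    """One row per query word, one column per document (in rank order)."""
    grid = {}
    for word in query_words:
        row = []
        for doc in ranked_docs:
            tokens = re.findall(r"[a-z]+", doc.lower())
            # Assumption: a raw count stands in for the system's term weight.
            row.append(tokens.count(word.lower()))
        grid[word] = row
    return grid

def render_grid(grid):
    """Draw each row as text: '.' for an absent word, a digit for its strength."""
    width = max(len(word) for word in grid)
    lines = []
    for word, row in grid.items():
        bars = " ".join("." if count == 0 else str(min(count, 9)) for count in row)
        lines.append(f"{word.ljust(width)} | {bars}")
    return lines
```

Reading across a row shows which runs of documents contain a given query word; reading down a column shows which query words contribute to a single document.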


Veerasamy, A. & Heikes, R. (1997). Effectiveness of a graphical display of retrieval results. In Belkin, N.J. et al., (Eds.) Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '97) (pp 236-245). New York : ACM.


.pdf available from ACM-SIGIR '97 (but no illustrations)

Aravindan Veerasamy, College of Computing, Georgia Institute of Technology
Russell Heikes, Statistics Center, School of Industrial Systems and Engineering, Georgia Institute of Technology

Author's Abstract
We present the design of a visualization tool that graphically displays the strength of query concepts in the retrieved documents. Graphically displaying document surrogate information enables set-at-a-time perusal of documents, rather than document-at-a-time perusal of textual displays. By providing additional relevance information about the retrieved documents, the tool aids the user in accurately identifying relevant documents. Results of an experiment evaluating the tool show that when users have the tool they are able to identify relevant documents in a shorter period of time than without the tool, and with increased accuracy. We have evidence to believe that appropriately designed graphical displays can enable users to better interact with the system.

Additional Comments
This system attempts to present information retrieval results in a novel and helpful way to save the user both time and effort in finding relevant material (and eliminating irrelevant material). A combination of a visualization of result relevance with a title display allows the user to skim only those titles which seem relevant from the visual summary. The system was experimentally user-tested and found to be more efficient than a traditional title display. References to similar visualization systems and a comparison to TileBars are included.


BEAD

Chalmers, M. & Chitson, P. (1992). Bead: Explorations in Information Visualization. In N.J. Belkin, et al. (Eds.), Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '92) (pp 330-337).


.pdf available from the ACM Digital Library - SIGIR'92

Matthew Chalmers and Paul Chitson, Rank Xerox Cambridge EuroPARC

Author's Abstract
We describe work on the visualization of bibliographic data and, to aid in this task, the application of numerical techniques for multidimensional scaling.
Many areas of scientific research involve complex multivariate data. One example of this is Information Retrieval. Document comparisons may be done using a large number of variables. Such conditions do not favour the more well-known methods of visualization and graphical analysis, as it is rarely feasible to map each variable onto one aspect of even a three-dimensional, coloured and textured space.
Bead is a prototype system for the graphically-based exploration of information. In this system, articles in a bibliography are represented by particles in 3-space. By using physically-based modelling techniques to take advantage of fast methods for the approximation of potential fields, we represent the relationships between articles by their relative spatial positions. Inter-particle forces tend to make similar articles move closer to one another and dissimilar ones move apart. The result is a 3D scene which can be used to visualize patterns in the high-D information space.

Additional Comments
BEAD is one of the first graphical prototypes for displaying information retrieval results. This innovative system inspired much subsequent research and development on systems for displaying retrieved documents graphically. While the display is not impressive by today's graphical standards, BEAD is a truly novel interface.
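The idea of letting inter-particle forces pull similar articles together and push dissimilar ones apart can be sketched with a simple spring model. This is not the authors' method (the abstract mentions fast approximation of potential fields, which is not attempted here); it is a naive all-pairs iteration, and the function name `layout`, the target distance `1 - similarity`, and the step parameters are assumptions made for illustration.

```python
import math

def layout(similarity, steps=200, rate=0.05):
    """Place n items in 3-space so that similar items end up close together.

    similarity[i][j] is assumed to lie in [0, 1]; the target distance
    between items i and j is taken to be 1 - similarity[i][j].
    """
    n = len(similarity)
    # Deterministic start: spread the items along a helix so no two coincide.
    pos = [[math.cos(i), math.sin(i), 0.1 * i] for i in range(n)]
    for _ in range(steps):
        for i in range(n):
            force = [0.0, 0.0, 0.0]
            for j in range(n):
                if i == j:
                    continue
                delta = [pos[j][d] - pos[i][d] for d in range(3)]
                dist = math.sqrt(sum(x * x for x in delta)) or 1e-9
                # Spring: pull together if too far apart, push apart if too close.
                stretch = dist - (1.0 - similarity[i][j])
                for d in range(3):
                    force[d] += stretch * delta[d] / dist
            for d in range(3):
                pos[i][d] += rate * force[d]
    return pos
```

With a similarity matrix derived from document comparisons, the returned coordinates could be handed to any 3D plotting tool; groups of related articles should then show up as spatial clumps.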


Similar to BEAD

Leouski, A. & Allan, J. (1998). Visual Interactions with a Multidimensional Ranked List. In W.B. Croft, et al. (Eds.), Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. (pp 353-354).


.pdf available from the ACM Digital Library - SIGIR'98 or
http://hobart.cs.umass.edu/~allan/Papers/sigir98c.ps (recommended)

Anton Leouski and James Allan, Center for Intelligent Information Retrieval
Department of Computer Science, University of Massachusetts

Author's Abstract
none

Additional Comments
This study investigates an interactive visualization technique where retrieved documents are placed in a 3-dimensional space and positioned according to the similarity among them. "Although our system does not explicitly create any clusters, we observed that relevant documents tend to appear in close proximity to each other, often forming tight 'clumps' that stand apart from the rest of the material." Two added features incorporate the user's feedback into the visualization (after the user has marked some documents as relevant or non-relevant). "Warping" makes known relevant objects move closer together and attract other relevant material. "Restraining" makes known relevant and non-relevant objects move apart, and the rest of the documents "stretch" between these two groups. The system in the study can visualize documents in 1, 2 and 3 dimensions. "We observed that 1-dimensional visualization was generally inferior to higher dimensional presentation, however we found almost no difference between 2- and 3-dimensional pictures." Some user testing is implied but not described in the wording of the brief report, and no additional information has been published.


Narcissus

R. J. Hendley et al., "Narcissus: Visualizing Information," in Proceedings of Information Visualization Symposium 1995, N. Gershon and S.G. Eick, eds., IEEE CS Press, Los Alamitos, Calif., 1995, pp. 90-96. Rohrer, R.M., Sibert, J. & Ebert, D.S. (1999).


.pdf available from IEEE Digital Library

R.J. Hendley, N.S. Drew, A.M. Wood & R. Beale
University of Birmingham, UK

Author's Abstract
It is becoming increasingly important that support is provided for users who are dealing with complex information spaces. The need is driven by the growing number of domains where there is a requirement for users to understand, navigate and manipulate large sets of computer based data; by the increasing size and complexity of this information and by the pressures to use this information efficiently. The paradigmatic example is the World Wide Web, but other domains include software systems, information systems and concurrent engineering. One approach to providing this support is to provide sophisticated visualization tools which will lead the users to form an intuitive understanding of the structure and behavior of their domain and which will provide mechanisms which allow them to manipulate objects within their systems. This paper describes such a tool and a number of visualisation techniques that it implements.


VIBE

Kai A. Olsen, Robert R. Korfhage, Kenneth M. Sochats, Michael B. Spring, James G. Williams. "Visualisation of a Document Collection: The VIBE System," Information Processing and Management, 29 (1), 69-81, 1993.


VR-VIBE

Technology Pushes in the Research of the CRG: Information Visualization: VR-VIBE


Communications Research Group, Department of Computer Science, University of Nottingham http://www.crg.cs.nott.ac.uk/research/technologies/visualisation/vrvibe/

Additional Comments
This site includes several screen shots of the system with descriptions. The VR-VIBE system is applied to collaborative visualizations in a project called Populated Information Terrains, which is described at http://www.crg.cs.nott.ac.uk/research/applications/pits/.

VR-VIBE

Collaborative Visualization of Large Scale Hypermedia Databases


http://www.crg.cs.nott.ac.uk/~ccb/papers/GMD-paper.html

Chris Brown, Steve Benford and Dave Snowdon, Communications Research Group, University of Nottingham

Author's Abstract
We begin by reviewing techniques for visualizing large scale hypermedia databases. We present a definition of large scale databases, introduce a scoping technique to handle them, and discuss collaboration support. This leads to a discussion of the implementation; we discuss browsing and searching, and the embodiment of database users in the visualization. Finally, we present an example application of these techniques: The Internet Foyer.


LifeLines

Plaisant, C., Milash, B., Rose, A., Widoff, S., & Shneiderman, B. (September 1995). LifeLines: Visualizing personal histories. ACM CHI '96 Conference Proceedings (pp 221-227).


ftp://ftp.cs.umd.edu/pub/hcil/Reports-Abstracts-Bibliography/95-15html/cps1txt.html

.pdf available from the ACM Digital Library - SIGCHI '96

Author's Abstract
LifeLines provide a general visualization environment for personal histories that can be applied to medical and court records, professional histories and other types of biographical data. A one screen overview shows multiple facets of the records. Aspects, for example medical conditions or legal cases, are displayed as individual time lines, while icons indicate discrete events, such as physician consultations or legal reviews. Line color and thickness illustrate relationships or significance, rescaling tools and filters allow users to focus on part of the information. LifeLines reduce the chances of missing information, facilitate spotting anomalies and trends, streamline access to details, while remaining tailorable and easily transferable between applications. The paper describes the use of LifeLines for youth records of the Maryland Department of Juvenile Justice and also for medical records. User's feedback was collected using a Visual Basic prototype for the youth record.

Additional Comments
Additional publications are available at http://www.cs.umd.edu/hcil/lifelines/


Cat-A-Cone

Hearst, M.A. & Karadi, C. (1997). Cat-a-Cone: An Interactive Interface for Specifying Searches and Viewing Retrieval Results using a Large Category Hierarchy. In Belkin, N.J. et al., (Eds.) Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '97) (pp 246-255). New York : ACM.


.pdf available from ACM-SIGIR '97 (but no illustrations)
Marti A. Hearst, Xerox Palo Alto Research Center
Chandu Karadi, School of Medicine, Stanford University

Author's Abstract
This paper introduces a novel user interface that integrates search and browsing of very large category hierarchies with their associated text collections. A key component is the separate but simultaneous display of the representations of the categories and the retrieved documents. Another key component is the display of multiple selected categories simultaneously, complete with their hierarchical content. The prototype implementation uses animation and a three-dimensional graphical workspace to accommodate the category hierarchy and to store intermediate search results. Query specification in this 3D environment is accomplished via a novel method for painting Boolean queries over a combination of category labels and free text. Examples are shown on a collection of medical texts.

Additional Comments
While combining existing 3D animation with information retrieval for MedLine, this system attempts

The underlying ideas of hierarchical categories and the graphical presentation of complicated relationships would require an experienced researcher to use efficiently. Also, the user has at least four different ways of interacting with the interface (menus, gesture-based clicking, keyboard accelerators and traditional buttons), which may be overwhelming. The user-testing mentioned in the conclusion should be carefully considered in future work. Section 4 of the report compares Cat-a-Cone to other early graphical approaches.


PDQ Tree-Browser

Kumar, H., Plaisant, C., & Shneiderman, B. (1997). Browsing hierarchical data with multi-level dynamic queries and pruning. International Journal of Human-Computer Studies, 46, 103-124.

Author's Abstract
Users often must browse hierarchies with thousands of nodes in search of those that best match their information needs. The PDQ Tree-Browser (Pruning with Dynamic Queries) visualization tool was specified, designed and developed for this purpose. This tool presents trees in two tightly-coupled views, one a detailed view and the other an overview. Users can use dynamic queries, a method for rapidly filtering data, to filter nodes at each level of the tree. The dynamic query panels are user-customizable. Sub-trees of unselected nodes are pruned out, leading to compact views of relevant nodes. Usability testing of the PDQ Tree-browser, done with eight subjects, helped assess strengths and identify possible improvements. A controlled experiment, with 24 subjects, showed that pruning significantly improved performance speed and subjective user satisfaction. Future research directions are suggested.

Additional Comments
The Pruning with Dynamic Queries (PDQ) Tree-browser allows users to view hierarchical data in a detailed view and an overview concurrently. At each hierarchical level, users can select three attributes to query on, and various widgets are provided to specify these queries. Results are dynamically updated as the sliders/menus are changed for each attribute. Nodes that do not match the query are greyed out, and the subtrees of these nodes are pruned completely and not shown. The idea is that with the irrelevant information no longer displayed, the user can examine the relevant information more quickly and easily and make a more informed decision. The overview of related research was quite detailed and included references to many similar projects, such as fisheye views and FilmFinder.
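The pruning behaviour described above can be sketched as a recursive filter over a tree. This is a minimal illustration, not the tool's implementation: the dictionary-based node format, the `greyed` flag, and the per-level predicate list (a stand-in for the slider and menu widgets) are all assumptions, as is the function name `prune`.

```python
def prune(node, filters, depth=0):
    """Return a copy of the tree with non-matching nodes greyed out
    and the subtrees of those nodes dropped.

    node:    dict with "name", "attrs" and optional "children" (assumed format).
    filters: one predicate per tree level; levels without a filter match everything.
    """
    matches = filters[depth] if depth < len(filters) else (lambda attrs: True)
    if not matches(node["attrs"]):
        # Non-matching node: kept but greyed out, and its subtree is pruned.
        return {"name": node["name"], "greyed": True, "children": []}
    children = [prune(child, filters, depth + 1)
                for child in node.get("children", [])]
    return {"name": node["name"], "greyed": False, "children": children}
```

Re-running `prune` with updated predicates each time a widget changes would approximate the dynamic updating of the view.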


Compiled and annotated by Elizabeth Staley for Michael Twidale for Independent Study
Graduate School of Library and Information Science
University of Illinois
501 E. Daniel Street
Champaign, IL 61820
Please send comments or suggestions to e-staley@alexia.lis.uiuc.edu
Last updated 12 June 2000