Primary information sources today fall into
the following overlapping categories, which are
listed here more or less in chronological order:
- millions of books
- thousands of journals, together offering
millions of articles
- thousands of discussion groups based on
electronic mail, on Usenet, and more recently on the WWW
or on combinations of these
- more than 10 billion static, publicly
accessible WWW pages and other files, which include text,
images, multimedia, hypermedia and computer programs, plus
similar items that are not available as static
stand-alone files but are stored more deeply in databases,
from which they can be extracted through the WWW
- open archives / repositories of
documents, set up and maintained mainly by scientific
organizations
In view of this huge volume, pointing out a
selection of interesting sources for a general audience is
neither feasible nor meaningful. Therefore, this tutorial
presentation gives an overview of information services
that help us to find, locate, access and use the primary
information sources needed for some purpose, quickly,
efficiently and effectively.
The information services can be categorized
as
A. public services, available to all users
through the WWW on a worldwide scale;
B. systems that have been set up by an
organization for its members only, offering more focused
and targeted services that make sense and offer added
value primarily in the framework of this organization.
Ad A:
The information services for all can be
categorized further:
- they can work in open access, available
for all users through the WWW free of charge,
- or they are available to a user only
after a fee has been paid by this user directly or at a
higher level by his/her organization, region or country.
Here we pay most attention to
- open access services,
- academic, scholarly, scientific and
technical information.
The information services for all include
the following:
- Some general, horizontal subject
directories guide us to selected WWW sites. Well-known
directories are accessible free of charge.
- More specialized, vertical subject
directories guide us to WWW sites in some specific subject
area or related to some specific geographical area. Many
are available free of charge.
- General, horizontal search engines help
us to find WWW pages and other files by using search
queries. Famous examples today are Google Web Search,
Microsoft Live Search, Yahoo and Ask. A few others, such
as Mooter and Wisenut, offer categorization and clustering
of results on similar subjects, to cope with the classical
problem in information retrieval that the words in a query
can be ambiguous. Most of these are accessible free of
charge.
- Meta-WWW-search engines transmit the
user’s query to existing general WWW search engines and
merge the search results obtained. Remarkable
among these is Vivisimo / Clusty, which categorizes
the results to cope with the classical problem in
information retrieval that the words in a query can be
ambiguous. Most are accessible free of charge.
- Several systems to search for information
offer not only automatic categorization/clustering but
also graphical visualization of the categories/clusters on
the user’s computer display. Examples are Kartoo and
Grokker. Several applications on the WWW are available
free of charge.
- Several databases offer bibliographic
descriptions of books and related information. Examples
are the big online shop Amazon, which includes a bookshop,
and of course the book databases/catalogues of many
libraries, such as the great Library of Congress and the
British Library. Most are available free of charge.
- Several open access meta-search systems
even allow us to search, in one action, through the
databases of several bookshops; an example is Antiqbook.
Some systems even include other meta-search systems among
their targets; an example is Addall, which searches, for
instance, Amazon and the Antiqbook system mentioned above.
- A few databases even allow full-text
searches of the contents of a selection of books. Examples
are Amazon and Google Book Search, which up to now are
free of charge in the search phase.
- A few general, horizontal open access
databases with search engines allow us to find article
titles in most areas of science and technology, such as
Infotrieve ArticleFinder, Ingenta, Scirus, and most
recently Google Scholar.
- Thousands of more specialized, vertical
databases allow us to find published articles and other
documents in some specific domain of science. However,
production is expensive and only a few are accessible free
of charge, such as PubMed for biomedicine and ERIC for
educational science and library and information science.
- A specialised database for searching and
browsing allows us to find scholarly open access journals
and articles published in these journals: the Directory of
Open Access Journals = DOAJ. Access is free of charge.
- Several more specialized systems allow
searching through open archives/repositories, such as
Scirus, OAIster and more recently Google Scholar.
- More specialised search engines allow us
to find images on the WWW. A famous and popular example
these days is the open access Google Image Search.
- Current awareness services can alert a
user when a new document has become available that
matches the user profile stored in the
system. For public access WWW pages as well as for
newsgroups, Google Alerts offers such a service free of
charge.
- A few systems allow not only subject searching, but also
citation searching as an alternative method to find
scholarly information. A well-established system is the
Web of Science, but this is costly. Scopus by Elsevier is
a new competitor that is also expensive. The open access
alternative is Google Scholar. These systems also allow
some evaluation of an article through the number and types
of citations received by that article.
- A database, browsing system and search engine allows you
to find Usenet newsgroups and newsgroup articles generated
during recent years: Google Groups. Usage is free of
charge.
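The merging step of a meta-search engine, as described in the list above, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method used by any of the systems named here: the engine names, the URLs and the scoring rule (reciprocal rank fusion with a constant k = 60) are all chosen only for this example.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Merge ranked URL lists from several engines into one ranking.

    Uses reciprocal rank fusion (an assumption for this sketch): each URL
    scores 1 / (k + rank) for every engine that returned it, and the
    scores are summed over all engines.
    """
    scores = defaultdict(float)
    for urls in result_lists.values():
        seen = set()
        for rank, url in enumerate(urls, start=1):
            if url in seen:  # ignore duplicates within one engine's list
                continue
            seen.add(url)
            scores[url] += 1.0 / (k + rank)
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], u))


# Hypothetical result lists for the ambiguous query "jaguar".
results = {
    "engine_a": ["http://example.org/cat", "http://example.org/car"],
    "engine_b": ["http://example.org/car", "http://example.org/os"],
}
merged = merge_results(results)
print(merged)
```

A page returned by both engines rises to the top of the merged list, which is the main benefit a user expects from combining several search engines in one action.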
Ad B:
Systems that have been set up by an organization for its
members only can offer more focused and targeted services
that make sense and that offer added value in the
framework of the organization. Many universities have
already installed such systems, in most cases under the
responsibility of the library or information center. The
two main building blocks of such a system are described
briefly below:
1. A system for federated searching = meta-searching
allows searching through several databases / search
engines simultaneously, in one action, to save time and to
obtain a higher recall. In contrast with public access
systems, such an institutional service can take into
account local interests and licenses to access particular
databases that are not offered in open access.
2. Starting from an item which has been found and is known
by the user, a link generator can create links to
related documents or services on the WWW that are
likely to be relevant for the user. This takes into account
the context, the local situation, including geographical
location and contents and services available to the
particular user. The most obvious, classical example of an
application of such a link generator is offering a link to
bring a user from a known bibliographic description of an
article to a WWW full-text database of the publisher, to
obtain the full-text article fast and directly through the
WWW. Many other paths are possible and interesting.
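The classical path described above, from a bibliographic description to a full-text link, can be sketched as follows. The sketch assumes that the institution runs a link resolver which accepts OpenURL-style key/value parameters; this text does not name a particular standard, so that choice, the resolver address and the sample record are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical base URL of an institution's link resolver.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def make_fulltext_link(article, base=RESOLVER_BASE):
    """Build an OpenURL-style link from a bibliographic description."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": article.get("title"),
        "rft.jtitle": article.get("journal"),
        "rft.issn": article.get("issn"),
        "rft.volume": article.get("volume"),
        "rft.spage": article.get("start_page"),
        "rft.date": article.get("year"),
    }
    # Drop empty fields so the resolver only sees known metadata.
    params = {key: value for key, value in params.items() if value}
    return base + "?" + urlencode(params)


# Made-up record, for illustration only.
article = {
    "title": "An example article",
    "journal": "Journal of Examples",
    "issn": "1234-5678",
    "volume": "12",
    "start_page": "34",
    "year": "2006",
}
link = make_fulltext_link(article)
print(link)
```

Because the base address belongs to the user's own organization, the same bibliographic description leads different users to the copy or service that their own institution has licensed, which is the local context that the link generator is meant to take into account.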
Conclusions:
- An increasing amount of information
is becoming available online, and a growing share of this
online information is available free of charge. The quality
and ease of use of software on the server as well as on
the client computers are improving. Furthermore, many
secondary information sources / services are now also
accessible free of charge. As a consequence, an increasing
number of end-users search for information online.
- The relatively new companies Yahoo! and
more recently Google have become important, famous and
popular players in the information industry. They offer
access to many information sources and services, more or
less integrated in their portal.
- In the case of simple, factual
information needs, the WWW and the search tools can work
fast and efficiently, like “magic”. However, in the case of
more complicated information needs, there is still no
“magic button” that brings us immediately to all the
required information.
- Besides the public information services
on the WWW for all, an increasing number of organizations
implement additional digital information services that
offer added value by taking into account the local
context.
Looking into the future, we can hope for
systems that offer
- better GUIDANCE in formulating queries,
in particular to cope with the problems associated with
using human natural language to express information needs,
so that recall and precision are increased
- better automatic subject CATEGORIZATION
of database contents and search results
- more and better VISUALIZATION of database
contents and search results, to guide searching
- more OPEN ACCESS primary information
sources such as articles, atlases, books, learning
materials…
(concretely for instance: growth of the OAI data layer)
- more open access information SERVICES for
navigation, searching, citation analysis…
(concretely for instance: more and better systems in the
OAI services layer)
- more open and more standardized APIs
(programming interfaces) of systems to search databases /
search engines, covering the format of the INPUT of
queries and also database contents that consist of
well-structured records/items with a standardized
structure, so that these systems can be incorporated more
efficiently at the client side in federated search systems
and other information services. This means better
integration of information sources, NOT locked behind
their own proprietary user interface
- more open, better structured and
more standardized OUTPUT of databases / search engines, so
that the documents/items extracted from these systems can
serve efficiently at the client side as input into
a local link generator or a database or some system to
analyze the documents/items
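To make the wish for well-structured, standardized output more concrete, the sketch below turns one harvested record into a plain dictionary that a client could pass on to a local database or link generator. It assumes that the source answers in the XML format of the OAI protocol for metadata harvesting with unqualified Dublin Core metadata; the sample response is made up and shortened, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses that carry unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up, shortened response used only for illustration.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example preprint</dc:title>
          <dc:creator>Doe, J.</dc:creator>
          <dc:identifier>http://repository.example.org/42</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Turn each harvested record into a plain dictionary."""
    root = ET.fromstring(xml_text)
    items = []
    for record in root.iterfind(".//oai:record", NS):
        items.append({
            "id": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
            "url": record.findtext(".//dc:identifier", namespaces=NS),
        })
    return items


records = parse_records(SAMPLE)
print(records)
```

Because every record follows the same structure, the client needs only one small parser for all sources that use this format, instead of one screen-scraping routine per proprietary user interface.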
More information is available through the
WWW
(note: in both addresses, BIBLIO is in uppercase, not
“biblio” in lowercase):
about the author:
http://www.vub.ac.be/BIBLIO/nieuwenhuysen/professional/
other tutorial materials available:
http://www.vub.ac.be/BIBLIO/nieuwenhuysen/courses/chapters/