Virtual Library Studies: February 2006

Saturday, February 18, 2006

MARC Cataloging

MARC is an acronym for MAchine-Readable Cataloging. It is a communications standard for exchanging bibliographic, holdings, and other data between libraries. It defines a bibliographic data format that emerged from an initiative led by the United States Library of Congress that began in the 1960s. It provides the protocol by which computers exchange, use, and interpret bibliographic information. Its data elements make up the foundation of most library catalogs used today.
The MARC Standards Office is part of the Library of Congress.
The record structure of MARC is an implementation of ISO 2709, also known as ANSI/NISO Z39.2.
The future of the MARC formats is a matter of some debate in the worldwide library science community. On the one hand, the formats are quite complex and are based on outdated technology. On the other, there is no alternative bibliographic format with an equivalent degree of granularity.

Authority records -- MARC authority records provide information about individual names, subjects, and uniform titles. An authority record establishes an authorized form of each heading, with references as appropriate from other forms of the heading.

Bibliographic records -- MARC bibliographic records describe the intellectual and physical characteristics of bibliographic resources (books, sound recordings, video recordings, and so forth).

Holdings records -- MARC holdings records provide copy-specific information on a library resource (call number, shelf location, and so forth).
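
The ISO 2709 record structure mentioned above can be illustrated with a small sketch. It assumes the usual layout of a 24-character leader, a directory of 12-character entries (tag, field length, starting position), and the data fields themselves. The function names and the sample record are hypothetical, and the sketch uses plain ASCII so that character counts equal byte counts; a real catalog would normally rely on an established MARC library instead.

```python
FIELD_END = "\x1e"   # field terminator
RECORD_END = "\x1d"  # record terminator
SUBFIELD = "\x1f"    # subfield delimiter


def build_record(fields):
    """Serialize (tag, data) pairs as leader + directory + data fields."""
    directory, data = "", ""
    for tag, value in fields:
        body = value + FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}"
        data += body
    directory += FIELD_END
    base = 24 + len(directory)        # where the data fields begin
    total = base + len(data) + 1      # +1 for the record terminator
    # Leader: positions 0-4 hold the record length, 12-16 the base address.
    leader = f"{total:05d}nam  22{base:05d}   4500"
    return leader + directory + data + RECORD_END


def parse_record(raw):
    """Recover the (tag, data) pairs by walking the directory."""
    base = int(raw[12:17])
    directory = raw[24:base - 1]      # skip the directory's own terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        # The stored length includes the field terminator, so drop it.
        fields.append((tag, raw[base + start:base + start + length - 1]))
    return fields


# Hypothetical record: a control number and a title field with one subfield.
sample = [("001", "demo0001"), ("245", "10" + SUBFIELD + "aAn example title")]
record = build_record(sample)
print(parse_record(record) == sample)
```

The directory is what lets software jump straight to a field without scanning the whole record, which is part of why the format is both granular and complex.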

Wednesday, February 15, 2006

Library of Congress Classification

The Library of Congress Classification (LCC) is a system of library classification developed by the Library of Congress. It is used by most research and university libraries in the U.S. and several other countries; most public libraries continue to use the Dewey Decimal Classification (DDC). It is not to be confused with the Library of Congress Subject Headings.
The classification was originally developed by Herbert Putnam, with the advice of Charles Ammi Cutter, in 1897, before Putnam assumed the librarianship of Congress. It was influenced by the Cutter Expansive Classification and the DDC, and was designed for use by the Library of Congress. The new system replaced a fixed location system developed by Thomas Jefferson. By the time of Putnam's departure from his post in 1939, all the classes except K (Law) and parts of B (Philosophy and Religion) were well developed. It has been criticized as lacking a sound theoretical basis; many of the classification decisions were driven by the particular practical needs of that library, rather than considerations of epistemological elegance.
Although it divides subjects into broad categories, it is essentially enumerative in nature.
The National Library of Medicine classification system (NLM) uses the letters W and QS-QZ, which LCC leaves unused. Some libraries use NLM in conjunction with LCC, not using LCC's R (Medicine).

Sunday, February 12, 2006

Melvil Dewey

Melvil Dewey (December 10, 1851–December 26, 1931) was the inventor of the Dewey Decimal Classification system for library classification.
Dewey was born Melville Louis Kossuth Dewey in Adams Center, New York in the United States. He attended Amherst College, graduating in 1874. It was while working as an assistant librarian at Amherst from 1874 until 1877 that Dewey devised his decimal system.
He moved to Boston where he founded and edited Library Journal, which became an influential factor in the development of libraries in America, and in the reform of their administration.
With his friend and fellow librarian Charles Ammi Cutter, he helped found the American Library Association (ALA); both men spoke at the First Annual ALA Conference held in Boston, Massachusetts in 1876.
In 1883 he became librarian of Columbia College, and in the following year founded there the Columbia School of Library Economy, the first institution for the instruction of librarians ever organized. This school, which was very successful, was moved to Albany, New York in 1890, where it was reestablished as the New York State Library School under his direction. From 1888 to 1906 he was director of the New York State Library, and from 1888 to 1900 he was secretary of the University of the State of New York. He completely reorganized the state library, which he made one of the most efficient in America, and established the system of state travelling libraries and picture collections. In 1890 he helped to found the first state library association, the New York Library Association (NYLA), and he was its first president, from 1890 to 1892.
He was an advocate of English language spelling reform and is responsible for, among other things, the "American" spelling of the word Catalog (as opposed to the British Catalogue). He changed his own name from Melville Louis Kossuth Dewey to simply Melvil Dui. He also sponsored periodicals on the Ro constructed language, in which the word structure marked its meaning in a hierarchy of categories.
Although Dewey is remembered for his Dewey Decimal System, his personal views would be highly controversial today. He was extremely racist toward African Americans and other minorities, as well as anti-Semitic and opposed to women's rights. He also advocated segregation of races.
Dewey is a member of the American Library Association's Hall of Fame.

Saturday, February 11, 2006

Thomas Jefferson

Thomas Jefferson (April 13, 1743 N.S. – July 4, 1826) was the third President of the United States (1801–1809), principal author of the Declaration of Independence (1776), and one of the most influential founders of the United States. Major events during his presidency include the Louisiana Purchase (1803), the Embargo Act of 1807, and the Lewis and Clark Expedition (1804–1806).
A political philosopher who promoted classical liberalism, republicanism, and the separation of church and state, he was the author of the Virginia Statute for Religious Freedom (1779, 1786), which was the basis of the Establishment Clause of the First Amendment of the United States Constitution. He was the eponym of Jeffersonian democracy and the founder and leader of the Democratic-Republican Party which dominated American politics for over a quarter-century and was the precursor to today's Democratic Party. Jefferson also served as the second Governor of Virginia (1779–1781), first United States Secretary of State (1789–1795), and second Vice President (1797–1801).

After the British burned Washington and the Library of Congress in August 1814, Jefferson offered his own collection to the nation. In January 1815, Congress accepted his offer, appropriating $23,950 for his 6,487 books, and the foundation was laid for a great national library. Today, the Library of Congress' website for federal legislative information is named THOMAS, in honor of Jefferson. The range of his interests is remarkable. For many years he was President of the American Philosophical Society.

Thomas Jefferson was also an agriculturalist, horticulturist, architect, etymologist, archaeologist, mathematician, cryptographer, surveyor, paleontologist, author, lawyer, inventor, violinist, and the founder of the University of Virginia. Many people consider Jefferson to be among the most brilliant men ever to occupy the Presidency. President John F. Kennedy welcomed forty-nine Nobel Prize winners to the White House in 1962, saying, "I think this is the most extraordinary collection of talent, of human knowledge, that has ever been gathered at the White House, with the possible exception of when Thomas Jefferson dined alone."

Friday, February 10, 2006

Search Engine Class

It is hard to imagine how people lived in the dark times before search engines were developed. Going back to the early 1990s, we find the first search tools, which were helpful in finding information stored on computer systems. The very first tool used for searching on the Internet was called Archie; it was created in 1990 by Alan Emtage, a student at McGill University in Montreal. This program was able to create a database of file names (an index of files), but it could not read or search the content of the files.

A few years later, in 1993, the first Web search engine, Wandex, was built on the index gathered by the World Wide Web Wanderer, a web crawler developed by Matthew Gray at MIT. Another very early search engine, Aliweb, also appeared in 1993, and still runs today. The first full-text, crawler-based search engine was WebCrawler, which came into use in 1994. Unlike its predecessors, it let users search for any word in any web page, which has been the standard for all major search engines ever since. Also in 1994, Lycos was started at Carnegie Mellon University, and it became a very popular and commercially successful enterprise.

WebCrawler innovative metasearch technology

Soon after, many search engines appeared and started to compete for popularity. These included Excite, Infoseek, Inktomi, Northern Light, and AltaVista. In January of 1994 Yahoo was started and it was first known as "Jerry's Guide to the World Wide Web". At first it was a directory of other sites, organized in a hierarchy (rather than a searchable index of pages). It was renamed "Yahoo!" shortly thereafter. Today Yahoo is the most visited website on the Internet with 412 million unique users and has $5 billion in revenues and 11,000 employees.

Search Engine Relationship Chart

Google started as a research project in early 1996 by Larry Page and Sergey Brin, who were postgraduate students at Stanford University in California. The existing search engines at that time ranked results according to how many times the search term appeared on a page. As a result, someone could manipulate the search results by repeating specific words on a page in order to appear at the top of the list. Google was the first successful attempt to analyze the relationships and links between websites.
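
The weakness of counting search terms can be seen in a small sketch. The function name and the two sample pages are hypothetical, and the scoring is deliberately naive: it counts exact word matches only, which is an assumption about how such early ranking roughly worked, not a description of any particular engine.

```python
def term_frequency_rank(pages, term):
    """Order page names by how often the search term appears in their text."""
    term = term.lower()
    scores = {name: text.lower().split().count(term)
              for name, text in pages.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical pages: one useful, one stuffed with the keyword.
pages = {
    "honest": "a guide to library cataloging rules",
    "stuffed": "library library library library cheap offers",
}
print(term_frequency_rank(pages, "library"))  # ['stuffed', 'honest']
```

The stuffed page wins simply because it repeats the word, which is exactly the manipulation that link analysis was meant to resist.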

Convinced that the pages with the most links to them from other highly relevant web pages must be the most relevant pages associated with the search, Page and Brin tested their thesis as part of their studies, and laid the foundation for their search engine. Originally the search engine used the Stanford University website with the domain google.stanford.edu. The domain google.com was registered on September 14, 1997, and the company was incorporated as Google Inc. on September 7, 1998 at a friend's garage in Menlo Park, California.

The original Google website as it looked in 1996.

All search engines today work by storing information about tens of billions of web pages. These pages are retrieved by a web crawler, also called a web spider: an automated software agent that follows every link it sees. The contents of each page are then analyzed to determine how it should be indexed. In Google's case the crawling is performed by a program named Googlebot, which also periodically requests new copies of web pages it already knows about. Data about web pages are stored in an index database for use in later queries. Storing such a large amount of information is very costly. Simply storing 10 billion pages of 10 KB each requires 100 TB, plus another 100 TB or so for indexes, giving a total hardware cost of around $200,000: 400 disk drives of 500 GB each, spread across 100 computers. By the end of 2005 Google claimed that its index had over 25 billion web pages, 1.3 billion images, 1 billion Usenet messages, 6,600 print catalogs, and 4,500 news sources.
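
The crawl-then-index cycle described above can be sketched in a few lines. Everything here is a simplifying assumption: the four-page "web" is a hypothetical in-memory dictionary standing in for real HTTP requests, and the index simply maps each word to the pages that contain it. It is not how Googlebot or any real engine is implemented.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "a.example/": ("library catalog search", ["a.example/marc", "b.example/"]),
    "a.example/marc": ("marc catalog records", ["a.example/"]),
    "b.example/": ("search engine history", ["a.example/marc", "c.example/"]),
    "c.example/": ("engine index storage", []),
}


def crawl(seed):
    """Breadth-first crawl: read a page, then queue every link not seen yet."""
    seen, queue, fetched = {seed}, deque([seed]), {}
    while queue:
        url = queue.popleft()
        text, links = WEB[url]        # stands in for downloading the page
        fetched[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


def build_index(fetched):
    """Inverted index: word -> set of URLs whose text contains that word."""
    index = {}
    for url, text in fetched.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index


crawled = crawl("a.example/")
index = build_index(crawled)
print(sorted(index["catalog"]))  # ['a.example/', 'a.example/marc']
```

A query then becomes a lookup in the index rather than a scan of every stored page, which is why the index is worth its extra storage.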

Find what people are searching for with Zeitgeist

Google's popularity has grown as people were attracted to its simple and clear design. Most people prefer not to have visual distractions while entering searches on web pages. This appearance was not an original idea; it imitated AltaVista's, but it was paired with Google's unique search capabilities. In 2000, Google began selling advertisements associated with search keywords. This strategy was important for creating a financially strong company. The ads were text-based in order to maintain an uncluttered page design and to maximize page loading speed. Keywords were sold with bidding starting at $0.05 per click. This model of selling keyword advertising was pioneered by Goto.com. However, while many companies failed in the new Internet advertising domain, Google generated increasing profits.

The key concept behind Google is PageRank, a method of assigning each page on the Internet a relative importance from 0 to 10, where 0 marks the least important pages and 10 the most important. PageRank results from a "ballot" among all the other pages on the Internet about how important the page is. A hyperlink to a page counts as a vote of support. The PageRank of a page is defined recursively and depends on the number and PageRank of all pages that link to it (its incoming links). A page that is linked to by many pages with high PageRank receives a high rank itself. If there are no links to a web page, there is no support for that page. U.S. Patent 6,285,999, describing part of Google's ranking mechanism (PageRank), was granted on September 4, 2001. The patent was officially assigned to Stanford University and lists Lawrence Page as the inventor.
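
The recursive definition can be made concrete with a small iterative sketch. It is only an illustration under stated assumptions: the four-page link graph is hypothetical, the damping factor of 0.85 and the fixed number of iterations are choices not given in this post, and the result is a set of scores that sum to 1 rather than the 0 to 10 scale mentioned above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Repeatedly pass each page's score along its outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}     # start with equal scores
    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share    # a link is a weighted vote
            else:
                # A page with no links spreads its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph page C collects votes from three pages and ends up with the highest score, while page D, which nobody links to, keeps only the baseline score.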

Many academic papers concerning PageRank have been published since Larry Page and Sergey Brin's original paper. In practice, the PageRank system has proven to be vulnerable to manipulation, and extensive research has been devoted to identifying falsely inflated PageRank and ways to ignore links from documents with falsely inflated PageRank.

Who manipulates Google and why?

Advanced commands on Google the easy way.

Please give your feedback on search engine class by using comment system below.

Did the class meet your expectations?
Will you incorporate what you have learned?
Any other comments are welcome.

Thursday, February 02, 2006

Access to Literature

The "Promote client access to literature" subject introduces a wide range of literature and promotes effective client access to a variety of sources.
Some of the study links are listed below:
1. http://www.amazon.com - online books, CDs, videos, DVDs and more.
2. http://www.bartleby.com - searchable quotations site.
3. http://www.evanston.lib.il.us/library/bibliographies - Evanston library.
4. http://www.genrefluent.com - the world of genre fiction.
5. http://www.libraryspot.com - huge reference collection.
6. http://www.nancypearl.com - books worth calling in sick for!
7. http://www.refdesk.com - one-stop site for all Internet things.
8. http://www.webrary.org - Morton Grove Public Library.
9. http://www.whichbook.net - find books to match your mood.
