Weighty Words: The Future of e-Books, Part 3

If e-books have been a commercial flop thus far, then how do Americans access books? In 2004-2005 we were divided equally between purchasing 2.3 billion books and checking out 2.4 billion books from libraries. Libraries remain a critical steward of our world’s knowledge. But even as the world wide web has made information more global, most libraries remain local in their focus.

Number of Words Dedicated to Wikipedia Entries

For librarians, it must seem that the web has turned information gathering on its head. The Internet is a heady young fellow, self-obsessed, self-referential, and unflinchingly modern in its focus (see right). Libraries house history, centuries of wisdom buried deep in stacks, and even deeper in the un-searchable text of yellowing book pages. So how can libraries remain relevant?

Let’s first examine the mission of public libraries versus the mission of major search engines. The following excerpts are from the mission statements of several major libraries and a certain web giant – see if you can distinguish between them (hover over or click the link for the answers):

A. “Helping people advance knowledge to enrich lives”

B. “…to organize the world’s information and make it universally accessible and useful.”

C. “…to sustain and preserve a universal collection of knowledge and creativity for future generations.”

D. “…collecting, cataloging, and conserving books and other materials…. to serve as a great storehouse of knowledge… and to function as an integral part of a fabric of information and learning that stretches across the nation and the world.”

E. “…to create a comprehensive, searchable, virtual card catalog of all books in all languages that helps users discover new books and publishers discover new readers.”


Private Sector Libraries

Clearly the missions of Google (and other search engines) are converging with those of the leading libraries. Google recognized this as an opportunity and launched the Google Library Project in late 2004. The project started with five major library partners, but has since extended to 11 libraries in three countries. Google is digitizing the contents of prestigious libraries such as Harvard, Stanford and Oxford, increasing access to tens of millions of unique books that were once accessible only to a small, elite group.

Google spends between $10 and $30 for every book it scans. The entire project, which will span at least a decade, will only cost Google the equivalent of its 4th quarter profits in 2006. Not a bad investment for the web giant.

Google has already made these books available on its Google Book Search, a fascinating portal that for the first time in human history opens up rare and not-so-rare books to anyone in the world. Not only are these books fully text-searchable, but Google has also recently announced an integration with Google Maps, making librarians, technologists and Google-philes giddy. No more leafing through musty books to find a quote or location.

Google’s Library Project is distinct from several others, such as the Open Content Alliance sponsored by Yahoo and Microsoft, because Google is barreling ahead and scanning copyrighted texts. This has not only provoked lawsuits, but more importantly, it has also provided a necessary impetus to publishers and libraries to address the issue of how to manage copyrighted books in the digital era.

Google Book Search allows full-text search for copyrighted works by simply telling you that the terms you searched for are in the book. Google then provides a tantalizing “snippet view” of the text as if it were torn right from the page. If you want to read the whole book online, however, you’ll have to wait. Rather than selling the e-book, Google paradoxically directs you to amazon.com, which will happily mail you a hardback in 5 to 7 days for $21.95.

This is about to change dramatically. On January 21, 2007, Google announced to the Times of London that it would launch an e-book service. Details are murky, but it seems likely that users will be able to purchase all or part of copyrighted books. I can only reiterate that pricing matters. With e-books, publishers can increase the exposure of previously obscure books and eliminate publishing costs. Ideally, this will increase profit margins and create significant savings for consumers. Because digitized books are easily divided, e-books could lead to a new model of micropayments enabling consumers to purchase only what they need, be it a chapter, a paragraph, or even just a quote.


Public Library Reactions

Google’s mission is not without its critics. Jean-Noël Jeanneney, president of France’s Bibliothèque Nationale, wrote a plaintive book called Google And the Myth of Universal Knowledge, warning of Anglo-Saxon cultural imperialism and the risk of market-driven libraries:

“As anyone who uses Google knows, what is intrinsic to all the information it provides is hierarchization. Even if there are many pages of results, the searcher rarely goes beyond the first few. …The profit motive will necessarily promote one product over another.”

As long as there have been publishers with a marketing budget, there have been attempts to woo readers. And while the psychological effect of publishers’ advertising cannot be stopped at the door of the library, our French friend would like to see it diminished.

There is some merit to this view, but not much. Libraries and library science will continue to weigh market forces against intellectual ones, but this new digital medium should not be made the culprit. If a library were to license, buy, or rent the contents of Google’s digital library, couldn’t it simply reorganize them in a neutral, intellectualized way that would make even Mr. Dewey Decimal proud? Or should the public libraries simply create their own digital library system from scratch?


Public Library Initiatives

Most libraries do see the upside to digitizing their libraries or they wouldn’t be working with Google. In fact, Google recently gave a $3 million grant to the Library of Congress for its World Digital Library Project in conjunction with UNESCO. The project is focused on improving web access to rare materials that, “…are physically stored in geographically dispersed locations, and which, when brought together with other collections through cross-national and cross-cultural multilingual search and browse capabilities, will yield new knowledge and insights.”

The World Digital Library may sound ambitious, but its scope is much more limited than that of Google Books. It will focus primarily on the long end of the tail: rare cultural treasures that most of us don’t use, rather than popular literature that most of us check out from our local libraries.
Priorities: Cost of Digitizing All of the Books in the World Comparison

So we have the ivory tower approach and the commercial approach. Caught in the middle are the libraries that most Americans use.

Is Google the only answer? To be sure, they have a massive head start (see statastic below). In a decade they may have more books in their digital collection than any library system on earth. But if there is true intellectual concern about the earth’s largest library being in the hands of a profit-driven company, why not launch a public initiative? Can the U.S. government even afford it?

It’s all about priorities. If the U.S. government decided to scan and digitize every one of the 65 million books on earth, it would only cost about $2 billion. That’s less than we are spending for one week in Iraq, and it’s less than kids (I presume it’s kids) are paying for cell phone ring tones each year. We can afford it; so far, we just haven’t chosen to.
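The post doesn't show its working for the $2 billion figure. One plausible reading, sketched below, is that it applies Google's reported $10 to $30 per-book scanning cost to all 65 million books; whether a public project would face the same per-book cost is an assumption made here for illustration.

```python
# Figures quoted in the post.
BOOKS_IN_WORLD = 65_000_000
COST_PER_BOOK_LOW = 10    # dollars, low end of Google's reported scanning cost
COST_PER_BOOK_HIGH = 30   # dollars, high end of Google's reported scanning cost


def digitization_cost(books: int, cost_per_book: int) -> int:
    """Total scanning cost in dollars at a flat per-book rate (an assumption)."""
    return books * cost_per_book


low_cost = digitization_cost(BOOKS_IN_WORLD, COST_PER_BOOK_LOW)
high_cost = digitization_cost(BOOKS_IN_WORLD, COST_PER_BOOK_HIGH)

print(f"Low estimate:  ${low_cost / 1e9:.2f} billion")
print(f"High estimate: ${high_cost / 1e9:.2f} billion")
```

On this reading, "about $2 billion" sits at the expensive end of the range, so the real bill could be lower.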

Even if the public sector did spend resources digitizing, libraries would face copyright issues. Tomorrow I will look at that as well as e-book initiatives at local libraries. And later this week, a case study that imagines the DC Public Library in 2017.



Google versus the World's Largest Libraries

Sources and assumptions: Google has not disclosed the number of books it is digitizing or its timeline for completing the project. Early in the project Google claimed that it would scan 3,000 books per day. This figure was used for the low estimate. The high estimate was based on expanded date searches (1500 to 2007) on Google Books that returned about 4.5 million books. This was extrapolated back to the beginning of the project to find the scan rate, which was then used to project the high estimate. Statastic believes that the high estimate is probably more accurate because the 3,000 books per day figure referred to a contract with only the University of California library system. The fact that Google continues to add libraries to the project indicates that Google is likely to accelerate the scanning rate.
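The extrapolation described above can be sketched roughly as follows. The two dates are assumptions for illustration only: the post says the project launched in "late 2004" and does not say when the Google Books search was run. The sketch also assumes scanning happens every calendar day at a constant rate.

```python
from datetime import date

# Figures stated in the post.
LOW_RATE_PER_DAY = 3_000        # Google's early claim, books per day
BOOKS_FOUND = 4_500_000         # approximate Google Books search result count

# Assumptions for illustration only (not given in the post).
PROJECT_START = date(2004, 12, 1)   # stands in for "late 2004"
MEASURED_ON = date(2007, 2, 1)      # hypothetical date of the search
HORIZON_DAYS = 10 * 365             # "a decade", scanning every calendar day


def implied_rate(books: int, start: date, end: date) -> float:
    """Average books per day needed to reach `books` between two dates."""
    return books / (end - start).days


def projected_total(rate_per_day: float, days: int) -> float:
    """Books scanned after `days` at a constant daily rate."""
    return rate_per_day * days


high_rate = implied_rate(BOOKS_FOUND, PROJECT_START, MEASURED_ON)
low_total = projected_total(LOW_RATE_PER_DAY, HORIZON_DAYS)
high_total = projected_total(high_rate, HORIZON_DAYS)

print(f"Implied scan rate: {high_rate:,.0f} books/day")
print(f"Low estimate after a decade:  {low_total / 1e6:.1f} million books")
print(f"High estimate after a decade: {high_total / 1e6:.1f} million books")
```

Under these assumed dates the implied rate comes out well above 3,000 books per day, which fits the view that the high estimate is the more plausible one. Different dates would shift the numbers.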

Weighty Words: The Future of e-Books, Part 2

With the advent of e-ink and e-paper, the only thing missing is the electronic content: e-books. E-books offer several clear advantages over print media: students can tote all of their textbooks back and forth to classes in one 7-ounce e-reader; travelers can see the world with detailed guidebooks and foreign language dictionaries for dozens of countries; city dwellers can store hundreds of their favorite books in a tiny urban apartment; and finally, e-books have the potential to revolutionize libraries around the world (more on that tomorrow).

Many analysts agreed that e-books would revolutionize publishing. In 2000, Accenture predicted that e-books would make up 10% of the book market by 2005. Unfortunately, e-books didn’t live up to their great expectations. In fact, e-books made up only 0.07% of the 2.3 billion books sold in 2005: fewer than 1 of every 1,200 books sold was in electronic format. Moreover, sales of e-books were flat between 2004 and 2005.
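For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above. It uses exact fractions to sidestep floating-point rounding; the variable names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in the post.
TOTAL_BOOKS_SOLD_2005 = 2_300_000_000
ACTUAL_EBOOK_SHARE = Fraction(7, 10_000)    # 0.07%
PREDICTED_EBOOK_SHARE = Fraction(10, 100)   # Accenture's 10% forecast

# Units implied by each share of the 2005 market.
actual_sales = TOTAL_BOOKS_SOLD_2005 * ACTUAL_EBOOK_SHARE
predicted_sales = TOTAL_BOOKS_SOLD_2005 * PREDICTED_EBOOK_SHARE

# "1 in N" form of the actual share, and how far the forecast overshot.
one_in_n = 1 / ACTUAL_EBOOK_SHARE
overshoot = predicted_sales / actual_sales

print(f"Implied e-books sold: {int(actual_sales):,}")
print(f"Forecast e-books sold: {int(predicted_sales):,}")
print(f"Roughly 1 in every {float(one_in_n):,.0f} books was an e-book")
print(f"The forecast overshot by a factor of about {float(overshoot):.0f}")
```

The "1 in N" figure lands a bit above 1,200, so the post's "1 of every 1,200" is, if anything, generous to e-books.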

So why has the public rejected the digitization of print media? One problem is that, unlike CDs, there is no way to digitize your current library of paperbacks. E-books and e-readers also present the classic chicken and egg conundrum. Without most titles available in e-book form, expensive e-readers lose their appeal. And without flashy new e-readers to energize consumers (as iPod did for digital music), publishers are naturally less willing to commit to the new format.

Some of that may be changing: Apple’s new iPhone is making bloggers like Booksquare all tingly:

“…the iPhone could either kill the nascent e-reader business or take it to new levels. We’ve been saying just about forever that the problem with dedicated e-reader is the fact that the consumer isn’t seeking a device that does only one thing. With its “smart” orientation features, the iPhone could usher in the mass market e-book era.”

Even as Apple might revitalize the market, if they insist on Digital Rights Management (DRM) as they do in the music market, they may undermine their potential success. Just as Apple iTunes makes it difficult to share digital music downloads with friends, some e-book sellers impose similar restrictions. That makes paperbacks more attractive than DRM-controlled e-books: you bought it, you can share it with friends. Not so with DRM e-books.

Public Domain Twain: Survey of E-book Prices for Huck Finn

Traditional publishing houses are also delusional when it comes to pricing e-books. If you want to read Mark Twain’s The Adventures of Huckleberry Finn, for example, you can buy the Penguin Classic paperback for $5.95, or for a modest 10% discount, you can download the same Penguin Classic from ereader.com.

But consider this: many classic titles have entered the public domain. This means that almost everything published before 1923 in the United States is free to use.

In the past, Penguin Classics made a profit by reprinting classic titles that would otherwise be unavailable. The Internet changes the equation. Take away the public’s need for the printing press, and e-books would seemingly be a major threat to Penguin Classics.

Fear not, for lovers of classics there is good news: Project Gutenberg has taken the Wikipedia approach to sharing e-books in the public domain. With 20,000 free e-books in their catalog – including Huck Finn – Project Gutenberg claimed more than 2 million downloads last month. Contrast this with the 1.7 million e-books sold in all of 2005 and we see once again that consumers of e-books are extremely price sensitive. (More on price sensitivity in music here.)
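To put the free-versus-paid gap on the same footing, the sketch below annualizes the Project Gutenberg figure. Assuming last month's download pace holds for a full year is an assumption made here for illustration, not a claim from Project Gutenberg.

```python
# Figures quoted in the post.
GUTENBERG_DOWNLOADS_LAST_MONTH = 2_000_000
EBOOKS_SOLD_2005 = 1_700_000

# Assumption: one month's pace is representative of the whole year.
MONTHS_PER_YEAR = 12
annualized_downloads = GUTENBERG_DOWNLOADS_LAST_MONTH * MONTHS_PER_YEAR

# Free downloads per paid e-book sale, on an annual basis.
free_to_paid_ratio = annualized_downloads / EBOOKS_SOLD_2005

print(f"Annualized free downloads: {annualized_downloads:,}")
print(f"Free downloads per paid e-book: {free_to_paid_ratio:.1f}")
```

Even if the monthly figure is a high-water mark, a single month of free downloads already exceeds a full year of paid sales.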

And the pricing premium for e-books isn’t restricted to the classics. Jimmy Carter’s bestseller, Palestine: Peace Not Apartheid, is actually cheaper in hardback at amazon.com ($16.20) than the e-book version at ereader.com or fictionwise.com ($16.99).

Publishers seem wedded to the paper publishing business model. This antiquated pricing model is bad for consumers and worse for the environment. If the 2.3 billion paper books sold in 2005 had been e-books, we would have saved more than 7 million trees. Until publishers drop prices and loosen Digital Rights Management restrictions, the convenience and sensibility of e-books may remain a pipe dream.

But what if our public libraries could help revolutionize the e-book market? More on that soon.
Great Expectations for e-books


Sources: Statastic research, Accenture, International Digital Publishing Forum

What Knots to Wear

Statastico has made some New Year’s resolutions:

1. I will update my blog five times a week.
2. I will try my darndest to provide at least one original statastic per week.
3. I’ll recommend some music that may help soothe your statastics-starved brains.

What does this mean to you, the avid reader? It means that coming up with a clever idea, incisive analysis, statastics and graphics every day is more than a full-time job… and that Statastico can’t do it alone. Rest assured, Web 2.0 – also known as Time Magazine’s Person of the Year – caught up to statastic! and churned out swivel.com.

Swivel is the Flickr of statistics and its user-generated (and statistically suspect) stats and graphs will challenge any of you bold enough to distinguish between correlation and causation. But it’s still good fun, and you have to admire wide-eyed entrepreneurs who staked their livelihood on the public’s thirst for more meaningless statistics.

Potosi Mines

So what has Statastico been up to? Glad you asked. Statastico was doing “research,” exploring the far reaches of the Incan Empire – from the apex of their power in Machu Picchu, to their tragic fate in the silver mines of Potosi, Bolivia at the hands of the Spanish conquistadors.

Seeing the quality of their stonework, the remnants of their agricultural prowess, and 500-year-old terraces still in use today makes one marvel that Pizarro so easily conquered this vast empire. The Incas governed a population of more than 15 million without the benefit of steel or the wheel. More shockingly, the Incas were the largest empire in the history of humankind without an alphabet or a written language (see chart below).

Or were they?

In January, Wired Magazine reports that there is an attempt to decipher Incan khipu textiles. The khipu may look like adornment, but these knotted strings were long assumed to be a type of abacus for recording census data. New research at Harvard, however, is exploring how the styles of knots, twists and colors in the string may form the basis of an Incan alphabet.

So far Harvard’s research is inconclusive, but their new approach applies network analysis based on the theory that different khipu textiles may refer to one another (much like Google PageRank). For any of you cryptophiles, Harvard has published the raw data here for you to noodle over.

In the interest of living up to Time Magazine’s Person of the Year honor, I thought I would offer some suggestions:

1. Many of the researchers focus on the khipu as stories to be handed down as a historical record. One of the advantages of knots as a form of communication is their reusability. What if the khipu were more like portable blackboards constantly being written, erased, and rewritten in order to quickly send messages throughout the empire? This would change the nature of the translation. While researchers might be focused on translating a history book, they may be looking at the equivalent of ancient knot-based emails.
2. Although the raw knot data seems pretty conclusive, it might be worth enlisting the help of some folks who are so brilliant at mathematics that they created an esoteric sub-discipline known as knot theory. Here’s an example of some of the fun problems the folks at Williams College are considering: “Is the trefoil the only nontritangent knot? (A knot is nontritangent if there is a realization of that knot that does not have any planes tangent to the knot at three or more points.)”

In any case, I applaud Wired Magazine for running this article. Any time you can cross anthropology and Google search algorithms, you have my attention. Now it is time for the person of the year (you – not me) to decipher the khipu and save the Incas from the ignominy of being the most extensive empire without a written language.
While you’re busy untangling the khipu alphabet, have a listen to the Munich-based Notwist’s 2002 album Neon Golden: beepy, indie, minimalist, fuzz pop.


Evidence of Writing in the 40 Largest Empires

Web 2.0 Visits the Grocery Store

People are passionate about their online groceries. And they are very passionate about milk, specifically Tuscan Whole Milk sold at amazon.com. Reading these 700+ reviews of a gallon by teens and tweens from around the world did make Statastico feel a bit old. (And uninformed.)

But it does raise an issue: how much of our time do we spend blogging, reviewing products and updating Wikipedia while at work? How does this affect worker productivity? Wikipedia arguably increases efficiency. Buying the product you want on the first try because of user feedback helps consumers and producers alike.

If I had more time, I’d try to correlate the average number of product reviews on Amazon to the unemployment rate. Alas, Statastico is going to the U.S. Open to watch Agassi beat Baghdatis tonight. Any volunteer Statasticos out there, or have you all gorged yourselves on a gallon of milk during the past hour?

Web 2.0 Visits the Grocery Store