Wednesday, December 23, 2009

Google Goggles

Google has an image search engine where the input itself is an image rather than text, which has been the norm even at Google.

The project is called "Goggles." It's intended for use with your mobile phone as you are out and about. If you, for example, see a bottle of wine, a book, or a work of art that you would like to know more about, snap a picture of it, submit it to Google, and view the results. You can also use Goggles to get more information about a business from its business card. Or take a picture of a store on a street to find out what else is in the neighborhood.

You can find more information about Goggles at:

Happy Holidays from Information Science News

A Merry Christmas to those who celebrate, and a Happy New Year to all! Here's wishing you a productive and rewarding 2010. May you find the things you search for.

I was trying to come up with a clever New Year's wish, and that last sentence was the best I could conjure. Still, it has a nice (George) Lucasian ring to it.

Happy Holidays y'all.

Tuesday, December 15, 2009

The Fourth Paradigm

There is a new collection of essays from Microsoft Research in honor of Jim Gray, the renowned researcher at Microsoft who was lost at sea in 2007. The collection is titled "The Fourth Paradigm: Data-Intensive Scientific Discovery." According to an article by John Markoff of the New York Times, the book's essays center on how to deal with the explosion of data -- particularly scientific data -- in recent years. This topic has come up in this blog before -- read about it here.

Some sample essay titles: "Beyond the tsunami: developing the infrastructure to deal with life sciences data," "Visualization for data-intensive science," and "From web 2.0 to the global database."

Access to this book is free -- it's available at the Microsoft Research site at

Monday, December 14, 2009

After reading a mention of the site in a recent NY Times article about how programmers are using government data in new and interesting ways (including a mashup of maps and crime reports that will help a user navigate home from a pub using a path with the fewest reported crimes), I wanted to take a look. It is a very well done site. The interface is very nicely designed, and a brief look at the available data makes me want to come up with an app myself. There is a load of raw data on this site -- weekly fatality reports, IRS immigration data, airport status data -- you name it.
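The "fewest reported crimes" route mentioned above is, at heart, a shortest-path problem where each street segment is weighted by its crime count instead of its length. Here is a minimal sketch of that idea using Dijkstra's algorithm. The street names and crime counts are made up for illustration, the function name `safest_route` is my own, and I have no idea how the actual mashup does it.

```python
import heapq


def safest_route(graph, start, goal):
    """Return (total_crimes, path) with the lowest summed crime count.

    graph maps each location to {neighbor: crimes_reported_on_that_segment}.
    Returns None when the goal cannot be reached from start.
    """
    queue = [(0, start, [start])]  # (crimes so far, current location, path taken)
    best = {start: 0}              # lowest crime total found so far per location
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale entry; a cheaper way to this location was found
        for neighbor, crimes in graph.get(node, {}).items():
            new_cost = cost + crimes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical one-way street graph with invented crime counts.
STREETS = {
    "pub": {"main_st": 5, "park_ave": 1},
    "main_st": {"home": 1},
    "park_ave": {"elm_st": 1},
    "elm_st": {"home": 1},
}

if __name__ == "__main__":
    print(safest_route(STREETS, "pub", "home"))
```

In this toy graph the longer walk via park_ave and elm_st wins, because the direct route along main_st has more reported crimes. A real app would presumably blend distance and crime counts rather than use crime alone.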

Interestingly, two days after the article appeared, there was another article in the NY Times about how the White House is asking all agencies to get their data onto this site...perhaps someone in the White House read the initial article and got fired up with the coolness of the idea of citizens creating value from government data.

Wednesday, December 2, 2009

Great article in the New York Times about the oddities and curiosities that reside in the main branch of the New York Public Library. Some highlights: the cane Virginia Woolf left on the riverbank the day she committed suicide, as well as William Blake's hand-engraved 1793 version of "The Songs of Innocence."

Check it out at

Monday, November 30, 2009

Got my Google Wave

Yippee! I just got my Google Wave account and activated it. I have been playing around with it when I have a few minutes. My initial impression is that it's going to take me a bit to get the feel of what I can do with a "wave."

I have seen web-based collaboration software before with some of the features (like real-time messaging) that I see in Wave. However, Wave adds nifty new features like a playback mode where you can step through each individual contribution to the wave (someone typed something in, posted a video, whatever) to get a sense of the history of what happened.

Wave is a work in progress -- when I tried to do playback and then fast forward to the last message of a large wave, Wave crashed. It recovered gracefully though, and did not crash my browser -- it just told me I needed to refresh the page.

I'll try to write up some more impressions when I get more familiar with the tool.

Friday, November 6, 2009

Google Dashboard

Google now allows you to see some of the information they have collected on you from your various accounts (e.g., Gmail and Picasa). The announcement is available at: and the dashboard view is available at http://www/ (you need to log in with your Google username and password to see your data).

I took a look, and I have to say that I was expecting to see more. What I did see, and what the New York Times Bits blog also noted, is that most of the information that is listed on the dashboard is simply the kind of humdrum stuff you already know about your account. For example, how many pictures you have stored in your Picasa Web Album and how many emails you have stored in Gmail. I guess I was expecting something more along the lines of "this dude is 40, male, and here's the last web site he visited," but there is nothing like that to be found in Dashboard.

The Bits blog entry also quotes John Simpson of Consumer Watchdog who says essentially that the information in Dashboard is the info linked to your name, but the real dirt is in the information linked to your IP address, which Google is not revealing for now.

Tuesday, October 13, 2009

Munging mountains of data

An interesting New York Times piece this week on training today's young minds to comprehend and manipulate the huge (and growing) amount of data available for processing by computer.

IBM and Google are lending a hand to universities so that students can leverage the big processing power needed to even begin to deal with this scale of data.

The article is available at:

Wednesday, June 10, 2009

Google Wave

Google Wave is a forthcoming "collaboration and communication" tool from Google. From the preview screenshots, I'd say it looks a little like some other collaboration tools I've seen, especially tools like Blackboard that are used by academic institutions. The "real time" aspect looks to be a newish wrinkle though.

Like most Google tools, it's extensible, with its own API.

According to, we should be hearing more about this tool soon.

Google Squared: Structured Data Search

A new addition to the Google Labs suite of experimental and beta software, Google Squared, presents search results in a structured format, similar to a spreadsheet.

Some reviewers have framed this as an effort by Google to keep pace with Wolfram Alpha and Cuil, two other search engines that present results in a structured format.

According to several reviewers, when search engines like this work, the results are pretty cool. When these engines fail, however, they fail pretty miserably. The general consensus seems to be that this is an interesting idea that needs work.

Friday, May 29, 2009


So Microsoft has a new search engine called "Bing," apparently intended to replace Live Search. (Meanwhile I'm still wondering what happened to Live Search Academic).

The first thing that comes to mind when I hear this word is the Ned Ryerson character in "Groundhog Day" using it repeatedly as in "Ned Ryerson, got the shingles real bad senior year, almost didn't graduate! Bing! Ned Ryerson, dated your sister Mary until you told me not to. Bing!"

So I guess the word is being used appropriately to denote the "aha" moment, the "sound of found," as Microsoft says.

The site itself is not available yet; it goes live next week. Microsoft has apparently been previewing the functionality to the press, and there are several reviews available. One is particularly thorough and includes a search-by-search comparison of Bing and Google.

Computerworld also has a nice hands-on review titled "Hands on with Microsoft Bing."

Finally, there's a more newsy piece on the release in the New York Times.

Monday, May 11, 2009


The New York Times today talks about a new search engine from Stephen Wolfram, the creator of Mathematica software. The engine is not a Google/Yahoo-like web crawler, but rather a query interface to a large collection of data. WolframAlpha can answer queries like "ldl 120," for example, with charts of cholesterol levels in the U.S. population. This kind of search engine is less of a competitor and more of a complement to search engines like Google.

Friday, April 17, 2009

Is Perl dead?

As a young programmer working in the Systems Office at a large academic library in 1996, I discovered the wondrous utility of Perl for text processing and web programming. I hacked together a primitive service request program for users to report problems with desktops and dumb terminals from the old Geac GLIS 9000 System.

Lately it seems that Perl has fallen out of favor. Whether that's merely a generational thing or due to fundamental technical merits is debatable. The very question of whether Perl is in fact dead spurs lively discussion online. To wit, see the tersely worded site at To see more, just google "is perl dead" for some fun results.

There have been a number of languages to emerge in the last 10 years that one could use instead of Perl to get the job done. Ruby, PHP, and Python are the first to come to mind, but there are many others.

We still have a significant Perl codebase where I currently work, although we are moving more and more code over to Java. I'm just wondering what everyone else's experience is out there. I'd love to hear from you about this!

Tuesday, April 14, 2009

Local search

I just stumbled on EveryBlock last week when a friend emailed me a link to inform me that the Chinese food truck on the street where we work was closed down by the Licenses and Inspection Department. Coincidentally, I was browsing the New York Times website this week and was surprised to find a story there on EveryBlock.

EveryBlock aggregates data from a number of sources, including government information such as health inspection data. It's a neat site--you can find out about home sales and police reports in your neighborhood, as well as see whether anyone has geotagged a photo from your block on Flickr of late.

Monday, April 6, 2009

New article in code4lib

Issue 6 of code4lib is now live, and contains an article by none other than this blogger. code4lib's mission is to "foster community and share information among those interested in the intersection of libraries, technology, and the future." The article I wrote fits in nicely with this mission statement; it's about using tree structures from GUI libraries to represent hierarchical data. Check it out at:

Friday, March 6, 2009

Google Visualization API

The Google Visualization API allows you to display "structured" data using a variety of different visualizations.

An overview and documentation are available at: You can view some samples at the Google Ajax APIs Playground. Some of the interesting examples include an annotated timeline and a motion chart (of the housing price index against the unemployment rate).

Sunday, March 1, 2009

Data Rot on Sunday Morning

David Pogue did a piece about "data rot" on CBS's Sunday Morning this weekend. In case you are not familiar with the term, it means the process by which data formats and hardware become outmoded or physically deteriorate over time, thus rendering the data on them unrecoverable. One woman in the segment talks about how she can't watch her first movie for film school as there are no longer any machines to play it on -- even though school was not that long ago for her. Also, the CIO for the Library of Congress was interviewed about LOC's efforts to preserve historical and culturally significant information.

Check out the story at the CBS news website.

Monday, February 23, 2009

New York Times article about the "Deep Web"

An article in the New York Times today talks about the latest efforts to plumb the Deep/Invisible Web. Google's efforts in the area are discussed, as is an effort partially funded by Jeffrey Bezos. A University of Utah effort is also mentioned.

You can view the article online at:

Monday, February 9, 2009

Gmail Tasks and other things

On Monday, the Gmail blog weighed in on whether paper or the iPhone is the more productive tool. All of this was a humorous way to introduce Gmail Tasks, a new productivity tool available in Gmail (and also, by extension, on the iPhone).

In other news, there was an article in the New York Times the other day about digital archivists being in demand right now:

Tuesday, January 20, 2009


Since I am now searching for a proper external hard drive or other storage medium for my accumulating collection of videos and photographs (mostly of my adorable 6-month-old), I was quite intrigued this morning to hear an update on the status of "GDrive," the rumored online storage service from none other than Google.

You can see the story on Search Engine Land at:
GDrive: It’s Alive! — Or So It Appears


Monday, January 19, 2009

A New Year

A busy quarter at Drexel has kept me from posting for the past few months. There is a lot going on, however. I went to ASIST in October for the first time and had a great time meeting people and listening to a lot of great presentations.

I hope to post more regularly in the coming weeks.

One of the things I'm thinking about now is access to information for programmers and other technical folks, especially at the introductory/tutorial level. Is this information organized and easily accessible on the Internet? I think it may be a matter of finding the right search string or error message to dump into Google. I think this type of access would be complemented nicely by a browsable list of resources. Does anyone know of any good ones?