PROGRAMMING POWER
Integrating Notes content with Google Enterprise Search
By Colin Neale

I've been playing around with search for a number of years now. Just over a year ago, I was talking to someone about search appliances. An appliance is an out-of-the-box search solution that comes with all the hardware, OS, and software preconfigured and ready to go. We were discussing the Google Search Appliance (or GSA, shown in Figure A), as I was interested to find out how it handled Lotus Notes content.

FIGURE A

The entry level Google Search Appliance GB-1001 is quite yellow.

Like many systems, the GSA will crawl Notes content over HTTP. However, not all Notes databases are Web enabled, and even where they are, crawling over HTTP can be a somewhat hit-and-miss way of getting at the underlying raw content of value (i.e., the Notes documents themselves), especially for more complex applications.

"At this point (as every developer knows), the first time we run our software, it all works exactly as it was designed."

On the Google developer site, I found that the GSA accepts content from an external source through a feed submission process. Once indexed, this content is made available for discovery alongside all the other types of data held on the appliance.
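To give a feel for what a feed looks like, here is a minimal Python sketch that builds one XML feed document from a list of Notes-style records. It is an illustration only, not the C-Search code. The element and attribute names (gsafeed, header, datasource, feedtype, group, record, metadata, meta, content) follow my recollection of Google's published feeds protocol and should be checked against the current documentation; the build_feed function, the record dictionary layout, and the notes:// URL in the usage note are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, records, feedtype="incremental"):
    """Return a feed document (as a string) for a list of record dicts.

    Each record dict is hypothetical: {"url": ..., "fields": {...}, "body": ...}.
    """
    root = ET.Element("gsafeed")

    # The header names the data source and the kind of feed being pushed.
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = feedtype

    # One <record> per Notes document; chosen fields travel as metadata.
    group = ET.SubElement(root, "group")
    for rec in records:
        record = ET.SubElement(
            group, "record",
            url=rec["url"], mimetype="text/plain", action="add",
        )
        metadata = ET.SubElement(record, "metadata")
        for name, value in rec["fields"].items():
            ET.SubElement(metadata, "meta", name=name, content=value)
        ET.SubElement(record, "content").text = rec["body"]

    # ElementTree escapes &, <, and > in text and attribute values.
    return ET.tostring(root, encoding="unicode")
```

A call such as build_feed("notes_demo", [{"url": "notes://server/db.nsf/0/ABC", "fields": {"Subject": "Hello"}, "body": "Document text"}]) returns a single XML string that a feed client could then submit to the appliance.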

So I thought, why don't I adapt "C-Search", our own crawler for Notes, into a crawl-and-feed system for the GSA? It can't be that difficult, can it? After all, we had already done the hard part when we built the crawler.

I contacted Google and was lucky enough to be able to show them a mockup of how the system would work. Our system would:

  • allow administrators to choose which databases to index
  • select document sets for the GSA from template profiles built for each type of database to be indexed
  • choose which fields to index
  • keep a record of everything sent to the GSA to support incremental updates
  • handle search results authorization through integration with the GSA's own authentication and authorization SPIs
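The "record of everything sent" item is what makes incremental updates possible, and the idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual connector: the fingerprint and plan_feed names, and the use of an in-memory dictionary as the sent log, are assumptions for the example. A real system would persist the log, for instance in a Notes database.

```python
import hashlib


def fingerprint(fields):
    """Return a stable hash of a document's indexed fields."""
    digest = hashlib.sha256()
    for name in sorted(fields):
        # Null separators keep ("ab", "c") distinct from ("a", "bc").
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(str(fields[name]).encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def plan_feed(sent_log, current_docs):
    """Compare the sent log with the current crawl.

    sent_log maps document id -> fingerprint last sent to the appliance.
    current_docs maps document id -> dict of indexed fields.
    Returns (ids to add or update, ids to delete).
    """
    to_add = [
        doc_id for doc_id, fields in current_docs.items()
        if sent_log.get(doc_id) != fingerprint(fields)
    ]
    to_delete = [doc_id for doc_id in sent_log if doc_id not in current_docs]
    return to_add, to_delete
```

Documents whose fingerprint matches the log are skipped, so only new, changed, or deleted documents need to go into the next feed.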

Understanding the challenges
Google seemed to like my idea, so with some confidence I set about the development. There were some challenges along the way. Here are just a few:

  • How do we handle Notes documents with multiple attachments and monitor changes to individual attachments on these documents?
  • What happens if the feed process fails halfway through?
  • How do we break the XML files into manageable chunks for optimum performance?
  • How do we ensure Notes document level security is respected?
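Two of these questions, chunking and mid-feed failure, are closely related, and one possible approach can be sketched in Python. This is an assumption-laden illustration rather than a description of how C-Search solves the problem: chunked, send_in_batches, and the send callback are hypothetical names, and the batch size is arbitrary. The sketch also assumes a retry is made with the same document list and the same batch size, so that batch numbers still line up.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_in_batches(doc_ids, size, send, done=None):
    """Send doc_ids in batches, skipping batches already confirmed.

    send is a caller-supplied function that pushes one batch and raises
    an exception on failure. done is a set of batch numbers that were
    sent successfully on an earlier attempt.
    """
    done = set() if done is None else done
    for index, batch in enumerate(chunked(doc_ids, size)):
        if index in done:
            continue  # confirmed on a previous run, do not resend
        send(batch)   # an exception here leaves this batch unmarked
        done.add(index)
    return done
```

If send raises partway through, the caller keeps the done set and passes it back in on the next attempt, so the feed resumes at the first unconfirmed batch instead of starting over.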

Development was done entirely from the Google API documentation before my brand-spanking-new appliance arrived at my door. When it did arrive, I was keen to see just how good (or bad) the documentation really was. It turned out to be near perfect. I had already installed our connector software, so I was able to focus on the appliance as soon as it was delivered.


Copyright © 1998-2010, ZATZ Publishing. All rights reserved worldwide.