DOCUMENT MANAGEMENT
Making business sense of classification technology choices
By Bain McKay
Last month, we talked about the need for automated classification systems. This month, we go into more detail.
While automated classification systems are coming to market using advanced technology, not all are created equal. How can you be sure you understand what matters about automated classification systems so you can make the right choice for your corporation? This month, we look at the two main and yet very different types of classification technology: neural network and adaptive clustering or classification.
Interestingly enough, these technologies use opposite methods to arrive at their classification hierarchies. Neural network clustering classifies by difference in a batch process, whereas adaptive clustering technology classifies by sameness, through incremental convergence in real time. As a result, each delivers a different business value beyond the immediate value of records classification for records management.
Neural network classification systems
Neural network clustering technology, which has been around for some time, aims to simulate the way the brain classifies the volumes of information it compartmentalizes every second of every day -- clearly a massive information management problem in its own right. Neural networks compartmentalize data by highlighting the differences between documents, based on a set of significant representative phrases contained in those documents. In effect, they build walls between the data, delivering data silos.
Neural network classification hierarchies are developed by training the network on a representative sample of documents from the target domain. Because neural network processing is so time-consuming, decisions must be made as to how much data should be sampled to build the classification hierarchy. Statistical methods are used to ensure that the sample is an adequate, if approximate, representation of the document domain. Building a neural network can take two or more days of processing, depending on the size of the data sample and the power of the computer used to process it.
After neural network processing has been completed, the names of the nodes in the classification hierarchy must be edited to provide meaningful names that you understand. This is much like the manual editing process used in structured data modeling. After editing, the classification hierarchy is published to a server to begin its document classification work. Documents are processed through the classification hierarchy, which acts as a sorting bin, placing each document in the closest-fit classification node in the hierarchy.
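The sorting-bin step described above can be illustrated with a minimal Python sketch. It is not a neural network and not any vendor's method: a simple phrase-set overlap (Jaccard similarity) stands in for whatever scoring a trained network would apply. The node names, the phrases, and the function names phrase_overlap and closest_fit_node are all hypothetical, chosen only for illustration.

```python
def phrase_overlap(doc_phrases, node_phrases):
    """Jaccard similarity between two phrase sets (0.0 when both are empty)."""
    union = doc_phrases | node_phrases
    return len(doc_phrases & node_phrases) / len(union) if union else 0.0


def closest_fit_node(doc_phrases, hierarchy):
    """Return the name of the node whose phrases best match the document."""
    return max(hierarchy, key=lambda name: phrase_overlap(doc_phrases, hierarchy[name]))


# Hypothetical published hierarchy: edited node name -> representative phrases.
hierarchy = {
    "Contracts": {"service agreement", "termination clause", "liability"},
    "Invoices": {"amount due", "payment terms", "purchase order"},
    "HR Policies": {"vacation leave", "code of conduct", "performance review"},
}

# Hypothetical incoming document, reduced to its significant phrases.
document = {"purchase order", "amount due", "shipping address"}

print(closest_fit_node(document, hierarchy))  # prints "Invoices"
```

Note that every document lands in some node, even one that matches nothing well. That is the closest-fit behavior the text describes, and it is one reason the quality of the sampled training data matters.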
Copyright © 1998-2008, ZATZ Publishing. All rights reserved worldwide.