DOCUMENT MANAGEMENT
Making business sense of classification technology choices
By Bain McKay
Last month, we talked about the need for automated classification systems. This month, we go into more detail.
Automated classification systems are coming to market using advanced technology, but not all are created equal. How can you be sure you understand what matters about these systems so you can make the right choice for your corporation? This month, we look at the two main, and very different, types of classification technology: neural network clustering and adaptive clustering.
Interestingly enough, these technologies use opposite methods to arrive at their clustering or classification hierarchies. Neural network clustering classifies by difference in a batch process, whereas adaptive clustering technology classifies on sameness through incremental convergence in real time. As a result, each delivers a different business value beyond the immediate value of records classification for records management.
Neural network classification systems
Neural network clustering technology, which has been around for some time, is aimed at simulating the way the brain classifies the volumes of information it compartmentalizes every second of every day, clearly a massive-volume information management problem in its own right. Neural networks compartmentalize data by highlighting the differences between documents based on a set of significant representative phrases contained in the documents. In effect, they build walls between the data, delivering data silos.
Neural network classification hierarchies are developed by training the network on a representative sample of documents from the target domain. Because training a neural network is so time-consuming, decisions must be made as to how much data should be sampled to build the classification hierarchy. Statistical methods are used to ensure that the sample is drawn properly and yields an approximate representation of the document domain. Building a neural network can take two or more days of processing, depending on the size of the data sample and the power of the computer used for the processing.
After neural network processing has been completed, the names of the nodes in the classification hierarchy must be edited to provide meaningful names that you understand. This is much like the manual editing process used in structured data modeling. After editing, the classification hierarchy is published to a server to begin its document classification work. Documents are processed through the classification hierarchy, which acts as a sorting bin, placing each document in the closest-fit classification node in the hierarchy.
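The "sorting bin" step can be illustrated with a minimal Python sketch. It is not how any particular product works: the node names, their representative phrases, and the use of cosine similarity over simple word counts are all assumptions made for illustration. The only idea it takes from the description above is that each incoming document is placed in the closest-fit node of an already published hierarchy.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical published hierarchy (flattened to one level for brevity):
# each edited node name maps to the representative phrases that define it.
NODES = {
    "Contracts": vectorize("contract agreement party terms signature liability"),
    "Invoices": vectorize("invoice payment amount due total tax"),
    "Personnel": vectorize("employee salary benefits hire performance review"),
}


def classify(text, nodes=NODES):
    """Return the name of the closest-fit node for a document."""
    doc = vectorize(text)
    # If no node shares any term with the document, every score is 0.0 and
    # max() falls back to the first node; a real system would need a
    # deliberate "unclassified" bin instead.
    return max(nodes, key=lambda name: cosine(doc, nodes[name]))


if __name__ == "__main__":
    print(classify("Please remit payment for the attached invoice; total amount due is listed."))
```

In this sketch the hierarchy is fixed once it is published, which mirrors the batch nature of the approach: changing the nodes means rebuilding and republishing, not adjusting them as documents arrive.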