Panning for Gold on the Internet
One of the best parts of my job is discovery. Terms that were unheard of yesterday become part of everyday language. Here's a new one -- "text mining" -- that Dow Chemical brings to our attention.
The R&D function of Dow Chemical Co., Midland, Mich., is using cutting-edge capabilities in its information-technology laboratory to extract information from hundreds of thousands of documents on the Internet and from other sources. The technology is known as "text mining." The goal of the research is to discover knowledge and patterns of information that cannot be recognized or retrieved using traditional database-management or search-engine tools.
Dow began exploring opportunities in text mining in 1996. The company is now working with ClearForest Corp. in New York City (www.clearforest.com) on its latest text-mining capability.
Text mining allows Dow to more fully explore complex relationships among the contents of documents in textual databases. It also provides a visual interface for documentation.
"The analogy we like to use is panning for gold," said Randy Collard of Dow R&D. "If you think of the Internet as a stream, you do not need or want everything that is in that stream, just the gold nuggets. Text mining allows you to find those nuggets of information very effectively. Plus, it makes it possible to discover relationships that are not obvious."
One advantage of text mining is that it finds both expected relationships and unexpected or hidden ones. Using this capability, Dow can search for new customers, technologies, business partners or marketing trends in ways that were previously unavailable.
Text mining should not be confused with traditional Internet search-engine tools or database-management capabilities. Text mining occurs after a traditional search for documents is completed, in whatever format is used -- whether full text, abstracts or indexed terms. It then allows for exploration of complex relationships among the documents that the search returned. There is a visual interface, so researchers can actually see what significant patterns exist and where they occur.
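To make that distinction concrete, here is a minimal sketch of one simple kind of relationship mining: counting how often pairs of terms turn up in the same document once a search has already returned a set of documents. It is not ClearForest's method or Dow's; the function name, the sample documents and the tracked terms are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_counts(documents, terms):
    """Count how many documents mention each pair of tracked terms together.

    Hypothetical helper for illustration; real text-mining tools go far
    beyond simple single-word matching.
    """
    lowered_terms = [t.lower() for t in terms]
    pair_counts = Counter()
    for text in documents:
        # Reduce each document to its set of lowercase words.
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        # Sort so each pair is always stored in the same order.
        present = sorted(t for t in lowered_terms if t in words)
        pair_counts.update(combinations(present, 2))
    return pair_counts


# Hypothetical documents standing in for the results of a prior search.
documents = [
    "Acme announces a new polymer coating for automotive panels.",
    "Automotive suppliers evaluate polymer blends for lighter panels.",
    "Acme opens an agricultural research site.",
]
terms = ["acme", "polymer", "automotive", "agricultural"]

counts = cooccurrence_counts(documents, terms)
for pair, n in counts.most_common():
    print(pair, n)
```

A search engine would stop at handing back the three documents. The pair counts are the extra step: they show which terms tend to travel together across the whole set, which is the sort of pattern a visual interface could then display.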
ClearForest technology is Dow's current tool of choice. While other systems offer different capabilities, there is not yet a single tool that can accommodate all the needs of a company the size of Dow. An organized database is a prerequisite for getting the most out of information-management tools. The best text-mining tools are the ones that can extract previously unknown knowledge from unorganized servers. Dow is on the cutting edge in the use of this emerging technology.
Dow is a leading science and technology company that provides innovative chemical, plastic and agricultural products and services to many essential consumer markets. With annual sales of approximately $30 billion, Dow serves customers in more than 170 countries and a wide range of markets that are vital to human progress, including food, transportation, health and medicine, personal and home care, and building and construction, among others. Committed to the principles of Sustainable Development, Dow and its approximately 50,000 employees seek to balance economic, environmental and social responsibilities. For further information, visit Dow's Web site.