Developing a Strategic Platform for Searching and Retrieving Corporate Knowledge

Pete Tierney
Chairman, President & CEO

Copyright 1995, Inference Corporation, All rights reserved


Every day, in the United States alone, over nine million people go to work in jobs involving daily telephone contact with prospects, customers, and vendors. That's almost ten percent of the entire workforce. Seventy percent of those individuals deal predominantly with "incoming" calls, and many of those calls are made using "800" number or other toll free telephone options. Worldwide, the telephone offers businesses the means to interact with and serve their constituencies as never before.

This explosion in telephone-based customer contact has spawned a host of new technological and organizational innovations - sophisticated telephone switches, computer/telephony integration, integrated voice response, speech recognition, amply staffed call centers, and a host of other changes. The Aberdeen Group, in a recent report (May 1995), estimated that over $300 million was spent in 1994 on software tools that manage customer contacts, and it forecast that this market segment will expand to $1.5 billion by the end of 1998.

In addition, the emphasis in successful business planning and reengineering projects is moving toward more competitively advantageous ways to serve and respond to requests for assistance from customers, prospects, and vendors. These daily contacts, handled mostly by front office employees who interact regularly with people outside the organization, are the lifeblood of business in our increasingly service-related economy.

Front office personnel survive and succeed by integrating multiple sources of information and responding quickly and effectively to customer, prospect, and vendor requests. In this environment, information goes far beyond standard structured information, such as online customer master records. It includes the much larger world of unstructured information - corporate intellectual capital or know-how, as well as information found in sales and product documents and technical and policy manuals. In fact, the employees who seem most adept at using standard structured information to access and integrate unstructured information are often the most successful at what they do.

Tom Davenport of the University of Texas (in an article entitled "Coming Soon: The CKO" in the September 5, 1994, issue of Information Week) outlined the importance of managing a corporation's know-how - its intellectual capital. Mr. Davenport identified a need to create a new corporate position called Chief Knowledge Officer (CKO) that will be responsible for managing the firm's intellectual capital. In a related article, Fortune magazine's October 3, 1994, cover story equated "Intellectual Capital" with "Your Company's Most Valuable Asset." There is, clearly, a growing awareness of the increasing need to capitalize on such knowledge and put it to use in software applications, particularly in the front office.

Traditionally, however, much of the investment in information technology (IT) has been in the back office, where the business infrastructure is located and where the structured information is managed. Yet, more recently, several factors, such as the skyrocketing manpower costs of sales and customer support organizations, the money wasted on incorrect solutions or incorrectly recommended product configurations, lag times in responding to changes in product lines in customer service operations, and the increasingly competitive nature of service as a component of an organization's business solution are changing the pattern of IT investment.

These factors have resulted in initiatives that place more and more attention on providing support systems for front office personnel. The objectives are to improve the cost and effectiveness of the critical interactions between the company and its market. That is why call centers are now being staffed and managed with significant investments in computer/telephone integration (CTI) and integrated computer systems. This technology provides prospect and call tracking, request ticketing, problem identification and resolution, and product recommendation. However, the rapidly rising cost of handling calls and the focus on customer service as a competitive weapon are generating interest in alternative methods. For example, customers can interact directly with applications that let them solve problems and gather information on their own.

In addition, the advent of a new generation of computing architecture (client/server computing) has created a new class of customer - the internal user of information technology - who is dealing with numerous new applications based on technological innovations introduced over the past ten years. The demands placed on users are further complicated when these applications and innovations are used in mission-critical situations. These internal users are now being viewed, especially by organizations that have a "service" culture, as a new class of "customer" and are being served by what is generally called the help desk.

The help desk itself is staffed by people whose objective is to solve users' technical problems and assist them in getting the most from the information technology tools they use throughout the corporation. As a result, internal help desks are also being established in areas such as human resources and benefits administration.

The net effect is that an increasing level of IT investment is shifting to the front office where it directly supports customers (both internal and external), prospects, and vendors. Front office personal computers are being converted from productivity tools to system tools, integrating front office management with back office infrastructure. Newer desktop technologies, client/server architectures, and telephony systems that integrate computer databases, knowledgebases, multimedia annotations, voice response, and speech enabling systems are being used to implement these front office applications and, in some cases, are replacing mainframe or minicomputer applications.

The existing model for these systems is to build or buy an application template that automates and tracks customer support requests and then integrate the application with existing infrastructure such as electronic mail, network management, asset management, quality reporting, etc. In this scenario, structured customer and prospect data is normally stored in a relational database from which it can be rapidly accessed using deterministic queries that are usually generated in the application itself.
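As a point of comparison for the knowledge retrieval discussed later, a deterministic query of this kind can be sketched as follows. This is only an illustration: the `customers` table, its columns, the sample rows, and the `lookup_customer` function are hypothetical, and an in-memory SQLite database stands in for whatever relational database a given application actually uses.

```python
import sqlite3

# Hypothetical customer master table held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, support_plan TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme Corp", "gold"), (2, "Globex", "standard")],
)


def lookup_customer(connection, customer_id):
    """Run a fixed, application-generated query; returns one row or None."""
    return connection.execute(
        "SELECT name, support_plan FROM customers WHERE id = ?",
        (customer_id,),
    ).fetchone()


print(lookup_customer(conn, 1))
```

The query either finds the exact record or finds nothing. There is no notion of a partial or best match, which is the gap that the knowledge retrieval techniques discussed below are meant to fill.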

A fundamental component in the success of client/server front office systems is the ability to access and integrate corporate knowledge and make it available to front office personnel. This corporate knowledge, when it is readily accessible by users, improves the responsiveness and quality of problem identification and resolutions, product configurations, and product recommendations.

Traditionally, providers of front office, client/server applications have taken a tactical approach by providing several methods for storing and accessing knowledge, such as text retrieval and artificial intelligence (AI) techniques including decision trees, neural networks, rule-based systems, and case-based reasoning. The most successful of these approaches has usually been text retrieval but, lately, case-based reasoning has enjoyed dramatic growth in acceptance and use.

These tactical CBR approaches are important and significant. They have demonstrated that corporate knowledge, intellectual capital and know-how, can be effectively applied in the front office. They have also shown that other front office applications can be integrated with both structured databases and unstructured information to form complete front office solutions.

This paper proposes to take CBR one step further, as a more strategic platform and as a standard for representing and retrieving knowledge. It discusses why having a standard is desirable and important and how it can be achieved. In particular, this paper addresses three basic premises.

I. The need for a strategic platform that stores and retrieves corporate knowledge.
II. Case-based reasoning is the emerging platform.
III. Inference's implementation of case-based reasoning is the leader in the market.

Tutorials and essays on case-based reasoning are numerous (see, in particular, Riesbeck and Schank, 1989, and Kolodner, 1993), and additional references can be found at the end of this paper. Rather than attempt to provide yet another tutorial, this paper will concentrate on the business issues that we believe "make the case" for case-based reasoning.

I. The need for a strategic platform that stores and retrieves corporate knowledge

In our experience in the front office market, the dominant model has been a tactical, application-based approach to software applications. Although there is almost universal understanding and acceptance of the relational model for use with the structured data component of these applications, the approaches to unstructured data (or knowledge) have been proprietary and tactical. Examples include the use of various ways to author and retrieve this knowledge using incompatible forms of text retrieval, as well as rule-based systems, case-based reasoning, decision trees, and neural nets.

The use of these isolated techniques in standalone applications has generated what the October 3, 1994 Fortune article referred to as "islands of knowledge." These islands create multiple sets of unleveraged knowledge bases, each with its own support systems. (A similar situation occurred in IT applications prior to the emergence of Structured Query Language - SQL - for retrieving and relational technology for storing data.) The only common element in this approach to managing knowledge, unfortunately, has been the failure of attempts to connect these knowledge stores in any corporate-wide knowledge framework, and company after company has struggled unsuccessfully with the individual support and update requirements of each application. While these knowledge systems may have succeeded within a single, tactical project, they have not been leveraged, or have not succeeded, beyond that.

Of all of these varied approaches, only case-based reasoning appears to provide a strategic platform offering similar benefits to the common platform of the relational model.

The fundamental economics of the computer and software industry are based on the benefits derived from the adoption of common computing platforms. Clear examples of this include the use of the Intel-compatible processor, the Microsoft Windows operating system and the relational database, common platforms that allow companies to leverage central computing architectures for applications and standardized skill sets for systems, programming, and end user personnel. In addition, common platforms generate indirect benefits for buyers by concentrating Independent Software Vendor (ISV) resources on specific technologies, thereby accelerating the improvement of those technologies. This allows ISVs to establish and broaden markets in which to develop and sell improved products and components.

These principles of computing success should be applied to the use of knowledge in software applications. This would both increase the quality of the base technologies deployed and decrease the effective cost of implementing such systems. Without the use of a common, or strategic, platform for knowledge, IT organizations must reengineer such knowledge for use in other systems, thereby creating the "islands of knowledge" problem. A strategic approach offers both a means to concentrate the authoring and maintenance of a knowledge base in a central effort and a means for extended leverage through multiple deployment techniques in multiple applications. For example, in customer service, a traditional approach is to integrate the problem resolution knowledge base with the call tracking or problem management system. While this offers a complete package approach, we find that the knowledge investment is "captive" to a single application use, thereby limiting the ability to leverage that knowledge across other types of applications such as self help systems for customers via e-mail or Internet access, or even speech-enabled interactions.

II. Case-based reasoning is the emerging platform

The qualification of any format to be accepted for common use is based on a number of existing conditions:

CBR qualifies in exemplary fashion in each of the areas listed.

In 1989, Roger Schank and Chris Riesbeck (Riesbeck & Schank, 1989) published a seminal book describing years of research in developing a new method of implementing knowledge technology. This method, developed at Yale University, mimicked the intuitive process of human thought and was named Case-Based Reasoning (CBR). While this technique grew out of previous research in human memory, recall, understanding, and learning (Schank, 1975; Schank & Abelson, 1977; Schank, 1983), CBR was also designed to provide a new model for computer systems that capture, combine, share, and retrieve knowledge among workers who can put such knowledge to use.

Research in CBR technology exploded in the mid to late 1980s when the Defense Advanced Research Projects Agency (DARPA) funded work to establish CBR as a unique research area and formulate initial models for CBR processing on computers. Much of the work in this area was documented in a series of DARPA workshops (Kolodner, 1988; Hammond, 1989; Bareiss, 1991), and journals and conferences in cognitive science, machine learning, and artificial intelligence carried extensive articles on CBR, both as a model of human reasoning and as a computer model for automated knowledge acquisition and use.

Early uses of CBR focused on one-off implementations of the technology, utilizing variations of the base algorithm and logic outlined in Schank and Riesbeck's book, which provided the information for use in the public domain. In addition, the growing interest in CBR in academia generated step-by-step improvements in the state of the art. A recent book, Case-Based Reasoning by Janet Kolodner, at the Georgia Institute of Technology, further documents the model uses of the technology and presents a cookbook for building a case-based reasoning system.

Some of the basic concepts underlying CBR include (Kolodner, 1993):

These CBR concepts rapidly aligned themselves with several uses in software applications. For example,

In addition, there is a large following of CBR devotees who have organized to improve the state of the art for CBR and explore potential real-world uses for the technology. Numerous papers now document the progress of such research. Indeed, a recent Internet query identified no fewer than 35 recent papers published on or related to the subject of CBR. There are also several highly regarded conferences that focus on CBR, including both national and international forums.

Following the DARPA workshops, the American Association for Artificial Intelligence (AAAI) has sponsored a series of CBR workshops at its annual conference, and a new international conference has been organized (ICCBR'95 -- The International Conference on Case-Based Reasoning) which will be held for the first time in October 1995 in Portugal.

As stated earlier, a key measure of a common platform technology is that there be a core of public domain research establishing the base models plus a vigorous continuation of academic and industry research. CBR truly qualifies in that regard.

This research has spawned several commercial implementations of CBR from different companies, and numerous software firms are developing and shipping commercial CBR products worldwide. In the customer support and service marketplace, all of the top eight suppliers offer CBR-based problem identification and resolution technology along with their problem management and call tracking products.

In total, CBR has now been licensed and is in use by hundreds of corporations and literally hundreds of thousands of users worldwide, and it has been implemented in over a dozen different languages, which displays its adaptability. It has been successfully implemented in call centers, help desks, configuration systems, sales product recommendation systems, medical diagnosis, and a host of other re-use applications (Acorn & Walden, 1992; Allen, 1994; Barletta & Hennessy, 1989; Goodman, 1990; Iwata & Obama, 1991; Kleinert & Rao, 1995; Nguyen et al., 1993).

Perhaps the core of the growing enthusiasm for CBR in research and commerce lies not only in its intuitive nature and relative ease of implementation and use but, like SQL and the relational model in the early 1980s, there is no competing alternative that can offer such a broad range of uses.

Text retrieval is another widely used system-level technology for front office knowledge access but, despite advances in capability and scalability, it is rapidly being outscaled and outdated by newer techniques like CBR, which is more suitable for dealing with larger bodies of unstructured information. Moreover (as discussed in a later section in the context of the Generator), text retrieval techniques can contribute to and be subsumed in most CBR systems, offering their benefits within the emerging common framework but without their traditional limitations.

The search for common knowledge frameworks is not new by any means. As far back as 3,000 B.C., the Sumerians developed a common language that scholars (experts) could use to write down ideas so others could read and use them. Ever since then, we have constantly looked for ways to improve our means of publishing and sharing knowledge. In the 15th century, Johannes Gutenberg introduced the concept of movable type, providing a common, economical way to publish knowledge in books. Instead of restricting information to the privileged few who could obtain handwritten or custom-crafted publications, Gutenberg made knowledge more economical by using a common platform - the printing press. In today's front office, CBR offers a similar breakthrough for providing a common, computer-based platform for storing and retrieving information. In effect, CBR is the expert system for the rest of us - the non-experts who need to integrate corporate know-how with traditional systems and deliver value in their jobs day after day.
It helps people work smarter.

III. Inference's implementation of case-based reasoning is the leader in the market

In 1989, Inference Corporation, under contract to the NASA Johnson Space Command Center, delivered a custom-crafted software system that managed code library selections for NASA's mission control ground-to-ground software systems (Allen & Lee, 1989). That system has been acknowledged as one of the first successful commercial implementations of CBR. In March of 1991, Inference shipped the first commercial product to utilize CBR - CBR Express, a template-based software application designed for use in front office, customer support, and help desk organizations.

Inference today has the largest installed base of CBR products in the world. The company has more than 350 customers and over 500,000 end user licenses, and fifteen companies, including six of the top eight suppliers of customer support software applications, utilize or integrate Inference's CBR products in their offerings (the other two suppliers are currently implementing their own versions of CBR). Inference has licensed its technology to industry leaders such as Compaq, IBM, and Microsoft - companies that have embedded Inference's CBR technology into their products, and many high-profile Inference customers have received IAAI (Innovative Applications in Artificial Intelligence) Conference awards for their use of CBR technologies in real-world, high-return applications. These and many more documented uses of Inference's CBR products display the effectiveness, usability, and return on investment provided by Inference's implementation of case-based reasoning technology.
What makes Inference's CBR so attractive that it has become the market leader? Inference's customers and partners attribute it to a combination of:

Today, Inference is shipping CBR2, the second generation of CBR products. CBR2 can be distinguished from competing CBR implementations by a) its use of a computing architecture that addresses real-world, front office application needs in customer service, help desk, and sales automation and b) its incorporation of the means to author and access both cases (or anecdotal knowledge) and, using a text retrieval model embedded in a CBR-compatible interface, documents - for the world of more dynamic knowledge.

Much of that dynamic information or knowledge is often found in documents such as engineering change memos, technical notes, new product documentation, price lists, etc. The ability to take a knowledge base of information that is continuously reused and complement it with these sources of new and more dynamic information has often led users to augment CBR systems with text retrieval systems. Yet CBR's intuitive, interactive, dialogue-based query model and its weight-of-evidence scoring techniques can be applied to documents, as well. Inference's CBR2 Generator tool automatically indexes documents and adds them to a compatible case base format, which can be accessed through the standard CBR retrieval model. Only Inference has delivered this dual capability. It addresses both case history information and documents, replacing the need for supplemental text retrieval techniques. It also enables CBR based systems to easily incorporate dynamic information from documents, augmenting the information stored in case bases regarding previously solved situations. In addition, CBR2 has integrated other techniques, such as rules, to enhance and refine query techniques through accelerated dialogue management.
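To make the weight-of-evidence idea concrete, the following is a minimal sketch of how answers gathered during a dialogue might be scored against stored cases. It is not Inference's algorithm: the `Case` class, the `score` and `retrieve` functions, the weights, and the sample cases are all hypothetical and serve only to illustrate the general technique.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """A stored problem/solution pair (hypothetical structure)."""
    title: str
    solution: str
    # question -> (expected answer, weight); the weights are illustrative only
    evidence: dict = field(default_factory=dict)


def score(case, answers):
    """Matched weight minus mismatched weight, divided by the case's total weight.

    Unanswered questions contribute nothing, so the score can simply be
    recomputed each time the dialogue collects another answer.
    """
    total = sum(weight for _, weight in case.evidence.values())
    if total == 0:
        return 0.0
    earned = 0.0
    for question, (expected, weight) in case.evidence.items():
        if question in answers:
            earned += weight if answers[question] == expected else -weight
    return earned / total


def retrieve(cases, answers, top_n=3):
    """Rank cases by score, best first, and return (title, score) pairs."""
    ranked = sorted(cases, key=lambda c: score(c, answers), reverse=True)
    return [(c.title, round(score(c, answers), 2)) for c in ranked[:top_n]]


CASE_BASE = [
    Case("Printer out of paper", "Refill the paper tray.",
         {"Is the power light on?": ("yes", 1.0),
          "Is the paper light blinking?": ("yes", 3.0)}),
    Case("Printer unplugged", "Reconnect the power cable.",
         {"Is the power light on?": ("no", 4.0)}),
]

answers = {"Is the power light on?": "yes"}
print(retrieve(CASE_BASE, answers))
# -> [('Printer out of paper', 0.25), ('Printer unplugged', -1.0)]
```

Answering a second question ("Is the paper light blinking?" with "yes") would raise the first case's score to 1.0. Under the same assumptions, a document indexed into a compatible case format could be represented as just another `Case` whose evidence is derived from its text, so that cases and documents are ranked by one retrieval function.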

Inference is investing aggressively to advance the state of the art and the usability of CBR and is already in the final specification stages for its next major product family release, called CBR3. Knowledge authoring will be enhanced beyond the features announced in CBR2 Release 1 (see Inference's August 14, 1995, product announcement), and precision and recall capabilities in document retrieval will be improved. CBR3 will incorporate additional features to accommodate decision tree authoring techniques, while retaining the benefits of CBR's weight-of-evidence scoring, which is a key advantage over decision tree systems. In addition, CBR3 will add extensions of adaptive learning techniques (CBR is, inherently, an adaptive learning architecture), providing additional, optional user features for "harvesting" knowledge from call tracking or work flow systems. Standard application program interfaces (APIs) to the authoring engine will let users employ their preferred knowledge gathering and organization tools, such as word processors and spreadsheets, in the authoring process. In fact, the wide range of authoring techniques plus document integration and a proven, intuitive, dialogue-based query model mean that there will be virtually no constraints to prevent user adoption of CBR as a common platform for knowledge storage and retrieval.
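The idea of "harvesting" knowledge from call tracking records can be illustrated with a small sketch. This is an assumption-laden illustration, not a description of any CBR3 feature: the `harvest` function, the record fields (`symptom`, `resolution`, `solution`), and the exact-match test for an already known symptom are all hypothetical, and a real system would presumably use similarity-based matching rather than string equality.

```python
def harvest(case_base, closed_calls):
    """Add a new case for each resolved call whose symptom is not yet covered.

    Returns the list of cases that were added.
    """
    known = {case["symptom"].lower() for case in case_base}
    added = []
    for call in closed_calls:
        symptom = call["symptom"].strip()
        # Skip calls with no recorded resolution or an already known symptom.
        if call.get("resolution") and symptom.lower() not in known:
            case = {"symptom": symptom, "solution": call["resolution"]}
            case_base.append(case)
            known.add(symptom.lower())
            added.append(case)
    return added


case_base = [{"symptom": "Printer offline", "solution": "Restart spooler"}]
closed_calls = [
    {"symptom": "printer offline", "resolution": "Restart spooler"},  # already known
    {"symptom": "VPN drops hourly", "resolution": "Update client"},   # new knowledge
    {"symptom": "Screen flicker", "resolution": ""},                  # unresolved
]

print(harvest(case_base, closed_calls))
# -> [{'symptom': 'VPN drops hourly', 'solution': 'Update client'}]
```

Each harvested case becomes available to later retrievals, which is the sense in which a case base can grow through use.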

Inference has also led the CBR market by offering users a unique, multiple deployment model. CBR2 supports client/server implementations (utilizing the most common relational databases as a repository for storing case bases); stand-alone implementations (complete with a bundled database system from Raima Corporation); a callable CBR interface via Dynamic Data Exchange (DDE); embeddable CBR engines in Dynamic Link Library (DLL) form for Windows 3.1, Windows 95, and OS/2; and Shared Library (SL) for UNIX implementations under Sun Solaris and HP-UX. Embedded versions allow the use of a CBR2 engine inside any other suitable application, offering either a process or a fully customizable user interface. In addition, Inference is now shipping CasePoint Webserver, an HTML-compatible CBR2 server for the World Wide Web. Together, these represent an array of deployment techniques that offer the industry's most flexible means for using and distributing knowledge commercially in software applications. Additional deployment methods now under test include speech-enabled interfaces and wireless PDA implementations.

Finally, Inference has filed six patent applications regarding various aspects of the development of the CBR technology. Currently, two of those applications have been accepted for final review by the US Patent Office.

The implementation of the CBR model, the methods for authoring and managing case bases, and the multiple deployment techniques are a combination that offers users a sound technical foundation for managing unstructured information and integrating it with front office systems. If nothing else, it is a way to keep pace. Despite rapid advances in computing hardware, there are rarely opportunities in software technology that permit dramatic improvements in application quality and usability while simultaneously controlling the cost and simplifying the maintenance of those applications. Case-based reasoning is such a technology.


This paper has reviewed the benefits that can be derived from making CBR (and, especially, Inference's implementation of the technology) a standard for the representation and retrieval of corporate knowledge as a means for ensuring success in the creation of knowledge-intensive, front office applications. In addition to Inference's leadership in CBR development, the company has, through its Professional Services Group, core expertise in implementing (and assisting our customers in implementing) these systems. This enduring commitment and focus on CBR provides a sound basis for confidence in deciding to make CBR a strategic component of your corporate information technology plan.


Acorn, T., and Walden, S. (1992) SMART: Support Management Automated Reasoning Technology for Compaq Customer Service. In Innovative Applications of Artificial Intelligence 4, Proceedings of IAAI-92 (Scott & Klahr, editors). MIT Press/AAAI Press, Menlo Park.
Allen, B. (1994) Case-Based Reasoning: Business Applications. Communications of the ACM, V. 37, N. 4, March 1994.
Allen, B., and Lee, S. D. (1989) A Knowledge-Based Environment for the Development of Software Parts Composition Systems. In Proceedings of the 11th International Conference on Software Engineering, Pittsburgh.
Bareiss, E. R. (ed.) (1991) Proceedings of the 3rd Case-Based Reasoning Workshop. Morgan Kaufmann Publishers, San Mateo.
Barletta, R., and Hennessy, D. (1989) Case Adaptation in Autoclave Layout Design. In Proceedings of the 2nd Case-Based Reasoning.
