TOOLS FOR PATENT SEARCHING
Author Withdrew his Name
Imagine yourself on the deck of a large battleship, scanning the horizon with a pair of high-powered binoculars for an oncoming vessel. Night falls, and you are still searching the horizon for your objective. You may stand on the deck day and night, scanning the horizon with pinpoint accuracy, and never find another ship. Now imagine yourself sitting in front of a radar scope. Within seconds you notice a blip on the screen, indicating the presence of another ship. This is the accuracy and ease that Natural Language Processing software offers a person searching for a specific topic in the vast sea of knowledge.
Technological advances have created the ability to obtain vast amounts of information in a short amount of time, a feat not possible 30 years ago. Imagine the added complications now that worldwide access to information is commonly available and the amount increases exponentially daily. With the click of a button, computer users can search the contents of encyclopedias, government records, or technological journals in a matter of seconds. However, having the capability to obtain information is not equivalent to reviewing and understanding the information.
The advent of the internet and large computer databases compounded the problem of information management. Presently, the amount of information available overwhelms a single individual or a small group of people. This large amount of information has overburdened the traditional analysis systems designed for a simpler time. When information was limited to one database or a single source, a single searcher could understand the contents of the database without further help. However, now that multiple databases provide information developed worldwide, relying on a single searcher is impractical.
The solution proposed to handle this overwhelming amount of information is to computerize the information search. A commonly used method is the Boolean search. The Boolean search requires the user to input a series of single words connected by a limited set of operators. These may include connectors such as "and," "or," "not," or an express limitation to find the words within the same sentence or paragraph. The Boolean search is advantageous for experienced searchers who have a clear understanding of the query, as well as the limitations of the database. However, Boolean searches can be difficult for the uninitiated. A poorly developed query may result in thousands of possible answers or "hits." Further limitations to the search may lead to an overly restricted result with few or no hits. Additionally, relevant hits may never be retrieved because of the query constraints, especially when the searcher fails to introduce synonyms into the query, for instance, car and automobile or even vehicle. A Boolean search will use only the specific words given in the query; it will not introduce an equivalent word or phrase that may be used to describe the object of the query. This shortfall forces the searcher to run multiple searches. Additional problems arise with the implementation of a Boolean search.
The Boolean search takes time to learn and even more time to master. To master a Boolean search, a user must spend time learning how to properly connect words with operators, how to use synonyms, and how to limit the search to obtain a manageable result. The learning process consists largely of trial and error, which can be frustrating and time consuming. In the current competitive environment, businesses cannot afford to allocate time and resources to train every worker to master Boolean-based searches.
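The synonym problem described above can be seen in a toy example. The following is a minimal sketch, not the implementation of any product discussed here: the document texts, the `boolean_search` function, and its `all_of`, `any_of`, and `none_of` parameters are all invented for illustration.

```python
# Toy document collection; the texts are invented for illustration.
DOCS = {
    1: "a car with an improved braking system",
    2: "an automobile with an improved braking system",
    3: "a vehicle suspension for rough terrain",
    4: "a braking system for a bicycle",
}


def tokens(text):
    """Split a text into a set of lowercase words."""
    return set(text.lower().split())


def boolean_search(docs, all_of=(), any_of=(), none_of=()):
    """Return sorted ids of documents satisfying AND / OR / NOT word constraints."""
    hits = []
    for doc_id, text in docs.items():
        words = tokens(text)
        if not all(w in words for w in all_of):      # AND: every word required
            continue
        if any_of and not any(w in words for w in any_of):  # OR: at least one
            continue
        if any(w in words for w in none_of):         # NOT: none may appear
            continue
        hits.append(doc_id)
    return sorted(hits)


# "car AND braking" misses the document that says "automobile".
narrow = boolean_search(DOCS, all_of=["car", "braking"])
# The searcher has to supply the synonyms by hand to recover it.
broad = boolean_search(DOCS, all_of=["braking"],
                       any_of=["car", "automobile", "vehicle"])
print(narrow)  # [1]
print(broad)   # [1, 2]
```

The second query finds the "automobile" document only because the searcher thought to list the synonym, which is exactly the burden the Boolean approach places on the user.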
Many companies are working to develop methods to access, analyze, and report information quickly, in essence streamlining the process of acquisition and analysis. Companies are also developing user-friendly systems to help the overworked researcher obtain information quickly without learning new systems. These new systems are capable of analyzing all types of information; however, due to time limitations, this paper is directed to those systems that analyze patent information.
This paper reports on two products for patent analysis: MAPIT by Manning and Napier and SmartPatents IPAM 5.1. The range of the analysis varies depending on the product. MAPIT has a very thorough web site that allowed for hands-on use of their system, and the company provided vast amounts of literature. SmartPatents provided literature and allowed access to the software, but not the ability to conduct a search.
This paper will address the scope of each product and its contact information. Under the scope of the product, the analysis includes the method of search, the results displayed, and the ease of use. Contact information refers to the ease of contact and the help available either through the web site or through phone interviews.
Manning and Napier's MAPIT uses natural language processing ("NLP") for inquiries, in contrast to the Boolean search used by SmartPatents, which requires operators to connect words. NLP allows the user to input a query in natural language, that is, normal conversational language. NLP analyzes the query for exact words, as in the Boolean methodology, and also analyzes the query to determine the relationship between words and phrases, producing a search result containing the same or similar words and phrases used in the query. This simplifies query input by allowing the searcher to enter an idea, a simple description, as the query and obtain exact and closely related matches. Thus, the search result is not dependent on the user's expertise in connecting Boolean operators and words, as required in other systems such as SmartPatents, Westlaw, or Lexis.
MAPIT by Manning and Napier
MAPIT adopts the tested methodology of DR-LINK (Document Retrieval through Linguistic Knowledge) and applies it to patent analysis, claim analysis, and portfolio comparisons. DR-LINK is an intelligent text retrieval system based on linguistic principles. It offers an English language query interface, relevance ranking, concept matching, synonym expansion, acronym expansion, disambiguation of terms, proper noun recognition, and identification of noun phrases. MAPIT uses the NLP program DR-LINK developed by Dr. Elizabeth Liddy, President of TextWise, Inc., and Associate Professor, Syracuse University School of Information Studies.
Patent searching has always been a time consuming endeavor. It takes time to search through the five million plus patents issued by the United States Patent and Trademark Office to find a specific piece of information. MAPIT provides an easy way to review the patent database based on a simple natural language query.
MAPIT is a secure online system that understands the structure of patents and can compare patents, patent claims, and patent portfolios. MAPIT combines familiar keyword (syntactic) searching with the power of linguistic (semantic) searching. MAPIT then compares keyword matching (syntactic matching) and linguistic matching (semantic matching) improving the results of the search over a simple Boolean search. Because the search incorporates the linguistic matching, the result contains words that may not exactly match the keyword, yet still contain the implied, associated, or overall meaning of the query.
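The difference between keyword matching and matching on related terms can be illustrated with a small sketch. This is not MAPIT's or DR-LINK's algorithm, which the product literature does not disclose in detail; a hand-made synonym table stands in for real linguistic analysis, and the names `SYNONYMS`, `expand`, `match_counts`, and `rank` are hypothetical.

```python
# Hand-made synonym table; an assumption standing in for linguistic analysis.
SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "brake": {"braking", "brakes"},
}


def expand(query_words):
    """Return the query words plus any synonyms listed for them."""
    expanded = set(query_words)
    for word in query_words:
        expanded |= SYNONYMS.get(word, set())
    return expanded


def match_counts(query, text):
    """Count exact keyword matches and additional synonym-only matches."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    exact = len(query_words & text_words)
    related = len(expand(query_words) & text_words) - exact
    return exact, related


def rank(query, docs):
    """Order document ids by total matches, preferring exact matches on ties."""
    def key(doc_id):
        exact, related = match_counts(query, docs[doc_id])
        return (exact + related, exact)
    return sorted(docs, key=key, reverse=True)


# Invented example texts.
DOCS = {
    "A": "an automobile with improved braking",
    "B": "a car brake assembly",
    "C": "a lamp with a shade",
}
print(rank("car brake", DOCS))
```

Document "A" shares no exact word with the query "car brake", so a pure keyword search would drop it, yet it still ranks above the unrelated document "C" once related terms are counted.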
The first step in using MAPIT is to develop a dataset. This is done because the computer algorithms used to extract linguistic meaning require fairly high-powered computer systems and time. Manning & Napier's development laboratory uses a powerful Sun SPARC multiple processor server with 512 megabytes of RAM and 70 gigabytes of hard disk, linked to an even more powerful IBM SP-2 Massive Parallel Processor with 12 nodes, accessing over 600 gigabytes of information, and connected again to an even larger system with literally hundreds of powerful processors.
This process offers several advantages. The search queries are in natural language just as if speaking to a research assistant. Once the dataset is complete, subsequent searches performed using MAPIT can respond quickly to different search scenarios within the dataset. Thus, MAPIT allows for greater flexibility in the search with no delay, an advantage when time is of the essence. Even though the system requires a powerful computer, the user simply needs a standard computer with internet access.
MAPIT requires an Internet connection and Netscape Navigator 3.0 or above, with JavaScript and Java enabled. As of October 1997, MAPIT would not work with Internet Explorer or other non-Netscape browsers, including America Online. However, additional access is possible through direct extranet or telephone connection for qualified clients. The user simply signs in on the Manning and Napier web pages, enters a password, and the search may commence.
Because MAPIT can be accessed securely over the internet, multiple site access is possible. The user need only access the internet to perform a search. Searchers at different sites can contact MAPIT, conduct searches, and then compare their work with colleagues at different locations. MAPIT literature refers to the process of searching the database as "mining," based on the analogy of sifting vast amounts of ore, in this case information, to obtain the few significant jewels, the relevant information.
MAPIT's website provides a walk through display. The display is easy to understand and provides some basic information. However, there is no information as to the content of the database. For this article, MAPIT was never used. A product demonstration was not possible, so this article is based on information provided by MAPIT through product literature and information from the website.
The Search
To start a search, under Database the searcher must select from Portfolio Analytics, Industry Verticals, Products, and World Patents. Portfolio Analytics datasets compare one set of patents against a second set. This is helpful when comparing the patent portfolio strength of two companies. To understand the portfolio breadth and depth, the searcher can map a single set of patents, using MAPIT to compare each claim from each patent with every other claim and patent. Industry Verticals datasets permit tracking of a company within the context of the industry as a whole. At the time of this article, the Industry Verticals dataset was still in development. Products datasets compare patents with product data to help find infringers and potential licensees. At the time of this article, the Products dataset was still in development. World Patents compares patents worldwide. At the time of this article, the World Patents dataset was still in development.
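Mapping a single portfolio by comparing each claim with every other claim amounts to an all-pairs comparison. The sketch below shows the shape of that computation only. MAPIT's scoring is linguistic and is not described in the literature at this level, so a simple word-overlap (Jaccard) measure is swapped in as a stand-in; the function names and the claim texts are invented.

```python
from itertools import combinations


def jaccard(text_a, text_b):
    """Word-overlap similarity in [0, 1]; a stand-in for a linguistic score."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def compare_all_claims(claims):
    """Score every unordered pair of claims, most similar pair first.

    `claims` maps (patent_id, claim_number) to the claim text.
    """
    results = []
    for (key_a, text_a), (key_b, text_b) in combinations(sorted(claims.items()), 2):
        results.append((key_a, key_b, jaccard(text_a, text_b)))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Invented claims from three hypothetical patents.
CLAIMS = {
    ("P1", 1): "a wheel mounted on an axle",
    ("P1", 2): "a brake acting on the wheel",
    ("P2", 1): "a wheel mounted on a shaft",
    ("P3", 1): "a lamp with a shade",
}

pairs = compare_all_claims(CLAIMS)
for key_a, key_b, score in pairs:
    print(key_a, key_b, round(score, 2))
```

With n claims there are n(n-1)/2 pairs to score, which is one reason a precomputed dataset on a powerful server is attractive for large portfolios.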
The second step consists of accessing the Analytics section of the program. The Analytics technology is based on linguistics and natural language processing research. Analytics offers several methods to input a query: Concept Query, Patent Query, Claim Query, and Range Query. Under Concept Query the searcher may enter a natural language request of up to 50 words. Examples provided by Manning & Napier include: "fabricating a FET with multiple semiconductor substrates with different conductive surfaces, including epitaxial layers with various levels of doping" and "I would like information about computer technology used to control the orientation of packages or part on an automated product line." Under Display Options the search can cover patents, claims, and patents (all claims or best claim). Furthermore, the results can be sorted by relevance, theme score, or phrase score. A thesaurus will soon be available to aid the searcher.
Once the query is analyzed, a display shows a breakdown of the relevant words and the number of patents and claims containing such words. The Concept Query Results display places the patents and claims in the order requested under relevance. The relevance rating is based on stars: the more stars, the more relevant the individual result. The display contains a brief description of each claim. To observe the relevant patent and claim in full, the searcher simply clicks on the rank number. The results are displayed in groups of 25 hits.
Under Patent Query the searcher enters the patent number. The search will find all similar patents, using multiple, independent scoring metrics. The Display Options can limit the search to patents (all claims or best claim). The results can be sorted by phrase or theme score. The display of the results is similar to that obtained in Concept Query. Each relevant patent is scored based on a phrase and a theme score, which are displayed on the right hand side in descending order. The title of the patent, patent number, and rank number are displayed on the Patent Query result list. As in the Concept Query Results, the searcher can click on the rank number to view any patent.
Under Claim Query the searcher enters the patent number and claim number. The Display Option searches the claims in the database and sorts the results by Phrase Score or Theme Score. The Claim Query organizes the results by descending order of relevancy depending on either the Phrase or Theme score, which is displayed on the left hand side. Like the Concept Query and the Patent Query, the results contain a small description of the claim language for each patent and claim displayed. The result is accessed by clicking on the patent number and claim.
Under the Range Query, the search may be limited by selecting either Phrase Score or Theme Score. Furthermore, the searcher can set a score range to reveal the most relevant results. The display compares two patents, claim by claim. The display contains a small description of each compared claim and the relevant scores. The searcher can access the claims by clicking on rank to view a side-by-side comparison of the two claims, click on a patent number to view the full text of the patent, or on the claim to view the full text of the claim.
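The Range Query described above, which picks one score and keeps only the pairs inside a chosen range, can be sketched as a filter followed by a sort. The `ClaimPair` record, the `range_query` function, and the sample scores are assumptions made for illustration; they do not reflect MAPIT's internal data model.

```python
from typing import NamedTuple


class ClaimPair(NamedTuple):
    """One compared pair of claims with its two scores (0 to 100)."""
    claim_a: str
    claim_b: str
    phrase_score: int
    theme_score: int


def range_query(pairs, metric, low, high):
    """Keep pairs whose chosen score lies in [low, high], highest score first."""
    if metric not in ("phrase_score", "theme_score"):
        raise ValueError("metric must be 'phrase_score' or 'theme_score'")
    selected = [p for p in pairs if low <= getattr(p, metric) <= high]
    return sorted(selected, key=lambda p: getattr(p, metric), reverse=True)


# Invented claim labels and scores.
PAIRS = [
    ClaimPair("A-1", "B-1", 92, 75),
    ClaimPair("A-1", "B-2", 40, 88),
    ClaimPair("A-2", "B-1", 85, 97),
    ClaimPair("A-3", "B-4", 12, 20),
]

by_phrase = range_query(PAIRS, "phrase_score", 80, 100)
by_theme = range_query(PAIRS, "theme_score", 80, 100)
print([(p.claim_a, p.claim_b) for p in by_phrase])
print([(p.claim_a, p.claim_b) for p in by_theme])
```

The two metrics can select different pairs from the same data, which is why the choice between Phrase Score and Theme Score matters to the searcher.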
MAPIT displays the results in several easy to understand graphic forms. Under the Visualizations subheading in Analytics, the searcher can view the results under a Cluster Plot, a Two Dimensional plot, and a Three Dimensional plot.
The Cluster Plot represents a comparison of patents based on Phrase and Theme score. The vertical axis (Y axis) analyzes for similarity from a subjective perspective, the theme score. The horizontal axis (X axis) analyzes for similarity based on revealing terminology and phrases, the phrase score. Each axis has a range between 0 and 100. The lower numbers indicate dissimilarity while higher numbers reflect greater similarity. The lower left hand corner of the graph represents the matches with the lowest similarity in both the theme and phrase score, while the upper right hand corner represents those with the highest similarity.
The graph represents over 85,000 pairs of claims. Naturally, the majority of matches reside in the lower numbers, giving the appearance of a smudge on the graph. The upper right hand corner has few matches.
The two dimensional graphic visualization compares the patents in the result against each other. The graph is symbol and color coordinated to represent relevancy. Symbols represent scores between 80-89 (gray), 90-94 (blue), and 95-100 (red). Inserted within the graph is a table of patents which uses arrows to indicate the matching pairs. The searcher can access the top pairs by clicking on the respective link. The Top Pairs display shows the patent numbers and their rank. To view the patent or claims simply click on the corresponding link.
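The color coding of the two dimensional graph is a simple mapping from score bands to colors. A minimal sketch follows, using the three bands given above. The function name `relevance_color` is hypothetical, and the treatment of scores below 80 (returning `None`, meaning no symbol is drawn) is an assumption, since the literature does not say how lower scores are shown.

```python
def relevance_color(score):
    """Map a 0-100 similarity score to the color band described in the text.

    Returns None for scores below 80; that choice is an assumption.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 95:
        return "red"    # 95-100
    if score >= 90:
        return "blue"   # 90-94
    if score >= 80:
        return "gray"   # 80-89
    return None


for s in (79, 80, 89, 90, 94, 95, 100):
    print(s, relevance_color(s))
```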
The three dimensional graph compares one set of patents against another set of patents. The result represents similarity based on a ranking system similar to the two dimensional graph. Access to the matches is the same as discussed for the two dimensional graph.
MAPIT displays the result in an easy to understand visualization by order of relevance. The system is intended for users who do not have the time to learn a query system or the ability to string together a coherent series of commands. With the ever expanding use of software, perhaps this is a good choice because users no longer have to invest time learning a new system. The users can simply write what they want in general and vague terms and the computer will do all the thinking for them. Also, the system permits full text dumps, that is, vague queries without sorting relevancy. The computer takes care of the latter. The system looks at the ideas involved in the string of words, in contrast to a Boolean system, which searches for specific words.
Contact Information
Manning and Napier Information Services
1100 Chase Square
Rochester, NY 14604
Tel: (716) 325-6880 or 800-278-5356
Fax (716) 325-1036
info@mnis.net
http://www.mnis.net
SMARTPATENTS
SmartPatents is a Mountain View, California, based software developer founded in 1992 by Silicon Valley businessmen and patent attorneys Kevin Rivette and Irving Rappaport. SmartPatents offers the Intellectual Property Asset Management ("IPAM") system, which is a Windows-based patent analysis system. A product demonstration was not possible. This paper is based on literature provided by SmartPatents.
The IPAM 5.1 System consists of software and database products. The IPAM 5.1 Server software resides on Windows NT server hardware that supports SMP systems. The software offers a secure intranet connection, a flexible and open client/server architecture, and a Windows NT or Windows 95 based solution using standard HTTP protocols. The software can be used on any PC terminal with internet capabilities. The data products include all U.S. patents since 1972, repositories of EPO patents since 1978, and repositories of PCT applications since 1978.
The IPAM 5.1 system includes the SmartPatent WebBench. The WebBench works with a standard browser enabling searching, viewing, and printing patent text and images. The WebBench includes modules for patent analysis including patent grouping, patent text and patent images which are hyperlinked, hyper-annotation capability to link user-created notes across multiple documents, group notes, and figure to text links.
Other modules offered by IPAM 5.1 include a Hypercitation Analysis Module and a Business Decision System Module ("BDS"). With the Hypercitation Analysis Module, users can see who is citing their patents and how often. After the users create and view citation groups based on search criteria, they can evaluate technological competitiveness as well as identify the intellectual property leaders in a particular field. Furthermore, with this tool, the user can determine the licensing opportunities within a field. The BDS Module allows users to run pre-designed reports based on the Bibliographic Indices and custom reports by linking patent data to their corporation's own internal operational data. The BDS reports automatically sort, group, and summarize patent data to help identify key issues and trends. Users can conduct inventory analysis, assignee analysis, patent analysis, citation analysis, and grouping reports.
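The kind of question the Hypercitation Analysis Module answers, who is citing a company's patents and how often, reduces to counting over citation records. The sketch below is only an illustration of that counting; the record layout, the function names, and all patent and company names are invented and are not taken from IPAM.

```python
from collections import Counter

# Invented records: (citing patent, citing assignee, cited patent).
CITATIONS = [
    ("C-1", "Alpha Inc", "M-1"),
    ("C-2", "Alpha Inc", "M-1"),
    ("C-3", "Beta Ltd", "M-1"),
    ("C-4", "Beta Ltd", "M-2"),
    ("C-5", "Alpha Inc", "Z-9"),  # cites a patent outside the portfolio
]


def citations_received(citations, my_patents):
    """Count how often each patent in the portfolio is cited."""
    return Counter(cited for _, _, cited in citations if cited in my_patents)


def citing_assignees(citations, my_patents):
    """Count how often each assignee cites the portfolio."""
    return Counter(
        assignee for _, assignee, cited in citations if cited in my_patents
    )


MY_PATENTS = {"M-1", "M-2"}
print(citations_received(CITATIONS, MY_PATENTS).most_common())
print(citing_assignees(CITATIONS, MY_PATENTS).most_common())
```

A frequently cited patent, or an assignee that cites the portfolio often, is a natural starting point when looking for licensing opportunities.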
SmartPatent's IPAM 5.1 System software identifies, organizes, and analyzes a company's patents and portfolios. The IPAM 5.1 System software includes a built-in reporting system that includes Patent to Product Mapping, Patent Aging, Cluster and Bracket Analysis, Inventor Employment, and Inventors to Patents Grouping reports. Additionally, the software provides citation analysis visualization.
Patent to Product Mapping assigns patents to products and then integrates patent data with other operational databases such as Manufacturing and Distribution, Finance, and Marketing. This method of comparing patents to their respective corporate resource allows for greater flexibility in production management. A company can quickly view the effect a patent has on production, distribution, and marketing to determine whether to pursue the line or allocate the used resources for other more important patents. Furthermore, a company can review profitability and licensing opportunities of a patent. The availability of information on several levels allows management to make strategic and tactical business decisions to maximize output in a rapidly moving environment.
With Cluster and Bracket Analysis, companies can ascertain whether they have protected their core technologies. A company can quickly view whether they possess key technology that protects their main products, or whether a competitor possesses technology that may interfere with the company's market position. A company can decide to obtain, through licensing if necessary, technology to protect a line of products, or whether to develop and patent the necessary technology to protect the core technology.
Patent Aging serves as a chronological clock to monitor the lifetime of a patent. It provides the company with an overview of patents about to expire. This information is helpful to a company to decide whether to acquire or develop additional technology to protect its core products.
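A patent aging report can be sketched as a date calculation over the portfolio. This is an illustration under stated assumptions, not IPAM's report logic: the patent term is passed in as a parameter (`term_years`) because the applicable rule depends on the jurisdiction and filing date, and the function names, patent labels, and dates are invented.

```python
from datetime import date


def expiry_date(start, term_years):
    """Add a whole number of years to a date, mapping Feb 29 to Feb 28."""
    try:
        return start.replace(year=start.year + term_years)
    except ValueError:
        # The start date was Feb 29 and the target year is not a leap year.
        return start.replace(year=start.year + term_years, day=28)


def expiring_within(portfolio, today, window_years, term_years):
    """List (patent, expiry) pairs expiring between today and the window end.

    `portfolio` maps a patent label to the date its term starts running.
    """
    cutoff = expiry_date(today, window_years)
    report = []
    for patent, start in portfolio.items():
        expiry = expiry_date(start, term_years)
        if today <= expiry <= cutoff:
            report.append((patent, expiry))
    return sorted(report, key=lambda item: item[1])


# Invented portfolio; the 17-year term here is only an example value.
PORTFOLIO = {
    "X-001": date(1982, 3, 1),
    "X-002": date(1990, 6, 15),
    "X-003": date(1981, 11, 30),
}

aging = expiring_within(PORTFOLIO, today=date(1997, 10, 1),
                        window_years=2, term_years=17)
for patent, expiry in aging:
    print(patent, expiry.isoformat())
```

A real report would also need to account for factors this sketch ignores, such as lapses or adjustments to the term.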
Inventor to Patent Counts links inventors to the patents they developed or own. This helps pinpoint relevant inventors in a field of interest. Additionally, it permits companies to assign inventors to projects for which they have expertise. This method maximizes human resources within a company and allows management to place key people on urgent projects. The Inventor Employment Report tracks inventors within the company and allows management to monitor which inventors may have left to work for a competitor.
IPAM's electronic documents come in three versions: Standard, Plus, and Pro. The Standard version provides the text on the left half of the computer screen and the actual image of the paper patent on the right. The Plus version offers the added feature of text to page linking. The text on the left side of the screen is linked to the actual page of the patent image and figure references in the text are linked to the image page. Some documents are not available for PCT and some EPO patents. The Pro version offers line to line and figure to figure linking. The lines of the text are linked to the lines on the patent and figure references in the text are linked to the image page. Pro documents are not available for PCT and some EPO patents.
A SmartPatent IPAM 5.1 search is a Boolean type search. WebBench offers a selection of fields which include Patent No., Title, Assignee, Issue Date, Filing Date, Class, Int'l Class, Inventor, and several keywords to conduct a search. The searcher inputs the variables and the program displays the relevant patents. The system is similar to that used by the U.S. Patent and Trademark Office, in the Patent and Trademark Office Depository Library System. However, a significant difference is the scope of the database used by IPAM 5.1.
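A fielded search of this kind can be pictured as a filter in which every field the searcher fills in must match. The sketch below assumes case-insensitive substring matching and an implicit AND between fields; both are assumptions for illustration, as are the `field_search` function, the field names, and the sample records. It does not describe how WebBench evaluates queries.

```python
# Invented bibliographic records.
RECORDS = [
    {"patent_no": "X-100", "title": "Disk drive head",
     "assignee": "Acme Corp", "inventor": "Lee", "issue_year": 1994},
    {"patent_no": "X-101", "title": "Optical disk reader",
     "assignee": "Bolt Inc", "inventor": "Ruiz", "issue_year": 1995},
    {"patent_no": "X-102", "title": "Tape transport",
     "assignee": "Acme Corp", "inventor": "Lee", "issue_year": 1995},
]


def field_search(records, **criteria):
    """Return records in which every given field contains the given text.

    Matching is a case-insensitive substring test; fields are ANDed together.
    """
    hits = []
    for record in records:
        if all(
            str(wanted).lower() in str(record.get(field, "")).lower()
            for field, wanted in criteria.items()
        ):
            hits.append(record)
    return hits


acme = field_search(RECORDS, assignee="acme")
acme_disk = field_search(RECORDS, assignee="acme", title="disk")
print([r["patent_no"] for r in acme])
print([r["patent_no"] for r in acme_disk])
```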
SmartPatent IPAM 5.1 Workbench provides several helpful features for patent searching. Workbench permits a side by side display of hyperlinked patent text and images and hyper-annotations linked to patent text and/or groups. Furthermore, Workbench can search a patent database based on enterprise-defined keyword, abstract, claims-only, and full text, and save user-defined search definitions in the library. Additionally, the search can be limited to "quick search" or first page skimming.
IPAM is a useful tool for intellectual property management. The rapid access, cross referencing, and organization give any manager a quick overview of a company's intellectual property. Thus, the manager can decide on a strategy for corporate development that is more attuned to intellectual property protection.
Contact Information
SmartPatents, Inc.
1975 Landings Drive
Mountain View, CA 94043
info@smartpatents.com
Tel: (650) 237-0900
Fax: (650) 237-0910