Pierce Law's IP Mall


 


Patent Searching Research Archives

ORBIT's PowerSearch: a powerful tool for patent searching. (online database searching)

 Cloutier, Kathleen A.; Kaback, Stuart M.
 May, 1994
       Patent information is the primary focus of the ORBIT Online Service
 [1,2]. ORBIT's new PowerSearch capabilities reflect its understanding of
 the unique aspects of patent searching. PowerSearch is a set of commands
 that helps a user customize the searching environment. Searchers can now
 blend results from single or multifile searches, group patent documents
 dealing with the same invention, and flexibly define the concept of a
 duplicate for both bibliographic and patent records.
       MULTIFILE SEARCHING
      As a tool for searching multiple files, PowerSearch extends the
 utility of DIALOG's OneSearch and Data-Star's StarSearch. OneSearch was
 introduced in 1987. It searches an entire set of databases at once. This
 approach is satisfactory for some types of searches, such as an author or
 specific term search. Other searches must exploit the unique features of a
 particular database, and this is very difficult to do in a OneSearch
 environment.
      StarSearch was released in 1991. Data-Star recognized the user's need
 to manipulate individual databases in different ways. StarSearch
 sequentially searches each database of interest, either with the previous
 strategy or a new one. Ann Van Camp [3] compared the two approaches,
 beginning her article by stating, 'Multifile searching just got easier.'
      Now it is even easier. ORBIT's PowerSearch approach allows the user
 to expand or narrow the set of databases searched at any point during a
 multifile session. The results from any search statements can be combined
 as desired. As we shall see, this ability to control and flex the search
 environment is very valuable.
      STN has introduced a similar approach. The MESSENGER software has
 always kept the results of all search statements from all the files used in
 a session. Users can easily go back and forth between files using previous
 search strategies and results as desired. In late 1992, the FILE command
 was expanded to permit simultaneous use of multiple files. As in
 PowerSearch, it is possible in STN to continually change the file domain
 and later merge the results from the sets of interest.
      DUPLICATE DETECTION
      AND GROUPING
      ORBIT's PowerSearch excels in its definition of a patent 'duplicate.'
 In 1989, DIALOG became the first databank to provide bibliographic
 duplicate detection. As Carmen Miller [4] noted, it was 'a searcher's dream
 come true.' DIALOG has never extended this capability to patent searching.
 STN's duplicate identification includes patents. Unfortunately, duplicates
 in this system are defined in terms of documents, rather than inventions.
 The concept of an invention as a set of related documents is ignored.
 Derwent's World Patents Index, a key patent database, is available on STN
 but is not included in the set of files with duplicate detection
 capability.
      PowerSearch can create patent and bibliographic groups of records
 that describe the same invention or source document. A group contains a
 'first record' (decided by file order) against which related and duplicate
 records are determined. The searcher can print all members of each group,
 print just the first record from each group, exclude duplicates, or print
 duplicates only.
      ORBIT defines a duplicate patent record as one that 'represents the
 same patent publication and contains the same priority or application
 data.' Related records 'reference the same invention and contain additional
 patent, priority or application number information' [5]. The duplicate
 detection for a patent number links the nine-character Derwent format only.
 The kind or status codes associated with it are ignored. Thus, a published
 European patent application is considered a duplicate of the granted
 European patent for the same invention.
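      The duplicate and related definitions quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PatentRecord type, the strip_kind_code and classify helpers, the number layout, and the sample numbers are all hypothetical, and nothing here is ORBIT's actual implementation.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PatentRecord:
    """Hypothetical record: one database entry describing a patent."""
    file: str
    patent_numbers: frozenset   # publication numbers, kind codes allowed
    filing_numbers: frozenset   # priority and application numbers


def strip_kind_code(number: str) -> str:
    """Drop a trailing kind/status code such as A1 or B1.

    Assumed layout: two-letter country code, digits, optional kind code.
    """
    cleaned = number.replace(" ", "").upper()
    match = re.fullmatch(r"([A-Z]{2}\d+)(?:[A-Z]\d?)?", cleaned)
    return match.group(1) if match else cleaned


def classify(first: PatentRecord, other: PatentRecord) -> str:
    """Compare `other` with the first record of a group."""
    first_pubs = {strip_kind_code(n) for n in first.patent_numbers}
    other_pubs = {strip_kind_code(n) for n in other.patent_numbers}
    same_publication = bool(first_pubs & other_pubs)
    shared_filing = bool(first.filing_numbers & other.filing_numbers)
    if not (same_publication or shared_filing):
        return "unrelated"
    # Assumption: a duplicate is the same publication with no extra numbers.
    adds_nothing = (other_pubs <= first_pubs
                    and other.filing_numbers <= first.filing_numbers)
    return "duplicate" if same_publication and adds_nothing else "related"


# Invented sample data: a published application, its grant, and a later filing.
application = PatentRecord("FILE1", frozenset({"EP0123456A1"}),
                           frozenset({"US1990-500000"}))
grant = PatentRecord("FILE2", frozenset({"EP0123456B1"}),
                     frozenset({"US1990-500000"}))
later_filing = PatentRecord("FILE2", frozenset({"US5000001A"}),
                            frozenset({"US1990-500000", "US1991-600000"}))

print(classify(application, grant))         # duplicate
print(classify(application, later_filing))  # related
```

Because the kind code is stripped before comparison, the published application and the granted patent collapse to the same number, which is the behavior the text describes for European documents.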
      The grouping capability is extremely useful in pulling together data
 about an invention. Combining fields from a variety of databases leads to a
 more useful set of information. Details of patent claims, indexing and
 family members can be merged without extensive post-search editing. The
 expanded data can be helpful in screening results for relevancy and
 providing needed legal information. Some users will find that they never
 want to exclude duplicates, but will often prefer to group all of their
 results together.
      PowerSearch grouping for patents was released before bibliographic
 grouping. A few files are available for bibliographic grouping now. Other
 files with this capability are being added. In a bibliographic group,
 citations with the same author and title elements are arranged together in
 a print display. Duplicates are defined as having the same source as well.
 Related records have different sources. Thus, records for a journal article
 and the related conference citation are printed together when they have
 similar titles.
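      The bibliographic rule just described (same author and title form a group; the same source makes a duplicate, a different source makes a related record) might be sketched as follows. The normalize and group_citations helpers and the sample citations are hypothetical. The sketch also assumes exact matching after normalization, whereas the text only says that titles must be similar.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Case-fold and keep only letters and digits (an assumed matching rule)."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def group_citations(citations):
    """Group citations by author and title, then label each later member
    'duplicate' (same source as the first record) or 'related'."""
    groups = defaultdict(list)
    for citation in citations:          # input is assumed to be in file order
        key = (normalize(citation["author"]), normalize(citation["title"]))
        groups[key].append(citation)

    labeled = []
    for members in groups.values():
        first = members[0]
        group = [("first", first)]
        for other in members[1:]:
            same = normalize(other["source"]) == normalize(first["source"])
            group.append(("duplicate" if same else "related", other))
        labeled.append(group)
    return labeled


# Invented citations: a journal article seen twice and a conference paper.
citations = [
    {"author": "Smith, J.", "title": "Hair growth agents",
     "source": "J. Example Chem. 12, 34"},
    {"author": "Smith, J.", "title": "Hair Growth Agents.",
     "source": "J. Example Chem. 12, 34"},
    {"author": "Smith, J.", "title": "Hair growth agents",
     "source": "Proc. Example Conf., 1993"},
]

for group in group_citations(citations):
    print([label for label, _ in group])   # ['first', 'duplicate', 'related']
```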
      THE COMMANDS OF
      POWERSEARCH
      How does ORBIT accomplish all of this? The 16-page PowerSearch System
 Reference Guide describes the commands. Many of the new multifile functions
 are expansions of older ORBIT commands. The FILE command can now be
 followed by up to 40 filenames. As before, FILE automatically closes all
 previously opened files and erases all existing search results. PowerSearch
 should be viewed as a single 'FILE' session.
      The ADD and REPEAT commands open additional files for searching. ADD
 simply makes more files available. REPEAT not only adds files, but
 automatically executes all or part of the existing search strategy in them.
 Results from the new databases are automatically incorporated into the
 existing results. The NEW command is used to restrict searching to a single
 file or set of files. A NEW command puts all previously used files in an
 'inactive' state. NEW must be used with a single file when using the patent
 FAMILY command. It is also very useful when using special indexing or other
 features unique to a file. DELETE and SUBTRACT are used to narrow the
 search environment. Search results from different files are combined with
 the MERGE command for printing, sorting or grouping. A merged set cannot be
 used for searching, but the REPEAT command and Boolean operations on sets
 from the same files can be used instead.
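      To make the roles of FILE, ADD, REPEAT, NEW and MERGE concrete, here is a toy in-memory model of a session. The Session class, its method names, and the sample records are hypothetical; only the command behavior paraphrases the description above, and real ORBIT syntax and output are not reproduced.

```python
class Session:
    """Toy model of a multifile session (hypothetical, in-memory)."""

    def __init__(self, databases):
        self.databases = databases   # file name -> list of record strings
        self.active = []             # files that new searches run against
        self.opened = []             # every file opened since FILE
        self.strategy = []           # search terms entered so far
        self.results = {}            # statement number -> {file: [records]}

    def file(self, *names):
        """FILE: close everything and erase all existing results."""
        self.active, self.opened = list(names), list(names)
        self.strategy, self.results = [], {}

    def add(self, *names):
        """ADD: make more files available without rerunning anything."""
        for name in names:
            if name not in self.opened:
                self.opened.append(name)
            if name not in self.active:
                self.active.append(name)

    def search(self, term):
        """Run one search statement against the active files."""
        self.strategy.append(term)
        number = len(self.strategy)
        self.results[number] = {f: self._hits(f, term) for f in self.active}
        return number

    def repeat(self, *names):
        """REPEAT: add files and rerun the existing strategy in them."""
        self.add(*names)
        for number, term in enumerate(self.strategy, start=1):
            for name in names:
                self.results[number][name] = self._hits(name, term)

    def new(self, *names):
        """NEW: restrict searching to these files; others become inactive."""
        for name in names:
            if name not in self.opened:
                self.opened.append(name)
        self.active = list(names)

    def merge(self, number):
        """MERGE: combine one statement's results across files for printing."""
        return [(f, rec) for f, recs in self.results[number].items()
                for rec in recs]

    def _hits(self, name, term):
        return [rec for rec in self.databases[name] if term in rec.lower()]


# Invented records; the file names are ones mentioned in this article.
databases = {
    "WPAT": ["minoxidil lotion", "shampoo"],
    "USPM": ["minoxidil foam"],
    "CLAIMS": ["minoxidil salt", "hair dye"],
}
session = Session(databases)
session.file("WPAT", "USPM")
statement = session.search("minoxidil")
session.repeat("CLAIMS")            # strategy is rerun in the added file
merged = session.merge(statement)
print(merged)
session.new("WPAT")                 # USPM and CLAIMS are now inactive
print(session.active)
session.file("USPM")                # a second FILE wipes the session
print(session.results)
```

The last call shows why PowerSearch should be treated as a single FILE session: issuing FILE again discards every result collected so far.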
      Many other commands have been adapted for PowerSearch. Cross-file
 searching and statistical analysis (PRINT SELECT and GET) can be used with
 multiple files. The PRINT command can either display records in file order,
 or SORT can be used to rearrange records by a field. The HISTORY command
 provides a summary of the results for each search statement along with the
 postings from each file searched. FROM may be used after many commands to
 restrict the operation to a single file or set of files.
      A shortcoming of PowerSearch is that the NBR command for browsing
 database indexes has not been enhanced to handle the multifile environment.
 It can only be used with one file at a time. NBR defaults to the first
 active file. ORBIT gives a very clear message that this has happened. To
 search a different file, NBR FROM must be used with the filename. In this
 case, DIALOG and STN are much more advanced with their comparable EXPAND
 functions.
       WHAT DOES POWERSEARCH
       MEAN TO PATENT
       SEARCHERS?
       Patent searchers strive for a reasonable set of search results with
 the highest recall possible. Information obtained from the patent
 literature is often of major corporate importance. A search run in a single
 database rarely retrieves all relevant information. Each patent
 database deals with a different subset of the world's patent literature and
 has a different mix of technical and legal information. Some databases
 excel in specialized indexing and others in their organization of the legal
 data. The currency of the files also varies considerably. Searching
 multiple files and blending the results fully uses the strengths of each
 database. Adding a file can help narrow search results, find additional
 references, or augment the data concerning a particular invention.
      To be effective, a patent search must:
      1. Use the full capabilities of each database.
      2. Use the information found in one database as a search statement in
 another.
       3. Search parts of a strategy in different databases.
      4. Deal with results where multiple records represent the same
 invention.
      5. Review the results to determine which references are relevant to
 the original query.
      The addition of PowerSearch to ORBIT's searching, cross-file and
 statistical capabilities helps the searcher in all five of these functions.
 Item 1 is achieved with the NEW command, which operates in a single file
 environment and allows the searcher to focus on special indexing and
 features. Cross-file searching accomplishes Item 2 and has been enhanced to
 operate on multiple files. The third operation can be performed by
 searching several files at once or by using the REPEAT command to rerun all
 or part of a previous strategy. A notable (and already mentioned) weakness
 is the inability of the NBR command to work on more than one file at once.
 Some patent files have many spelling errors that must be incorporated into
 a search strategy. Pulling these together with one command would save time
 and be extremely helpful.
      PowerSearch's ability to group records is a tremendous help for the
 fourth item. The searcher must still perform family searching to be sure
 that all related records are in the answer sets. Grouping also assists Item
 5. The juxtaposition of enhanced titles, abstracts and portions of indexing
 from several databases makes the screening of results much easier.
       Some images from the World Patents Index should also be available by
 the time this article is published. This addition of chemical structures
 from 1992 will be invaluable, as will the availability of electrical images
 back to 1988. Perhaps someday the abstracts from Chemical Abstracts will be
 available as well.
      TIPS FOR USING
      POWERSEARCH
      Here are some things to keep in mind when searching for patent
 information with PowerSearch:
      * The FILE command works the same way it always has! FILE is only
 used once during a PowerSearch session. FILE erases all search results. If
 you have trouble breaking the FILE habit you may want to rename the system
 command to avoid mistakes. Use RENAME to change the command for the current
 session. To make a permanent change, type:
      TERM PROFILE
      RENAME FILE TO xxxxx
      (Choose a name that is not an ORBIT command. Avoid any word that you
 may want to use as a search term.)
      RESTART Y
      * When you are ready to group documents, the order of the files is
 very important. The number of duplicates identified depends on the file
 order. The record from the first file is always kept. For example, a search
 finds a USPM record citing a U.S. patent and a WPAT record that lists the
 same U.S. patent as well as other equivalents. No duplicates will be found
 when the file order is USPM, WPAT. If WPAT is the first file, the USPM
 record will be identified as a duplicate. REORDER is the command to change
 the sequence of files.
       * If your search uses Chemical Abstracts Registry Numbers, using PRINT
 HIT ITT will give the indexing and the roles for them. This extra
 information can be very useful in screening results.
      * Grouping pulls information about family members together but is not
 a family search. It is still necessary to search for the related records.
      * Kind codes are not considered unique. Information about a granted
 patent may be lost when duplicates are removed from a set. Remember that
 duplicates are removed based on which database is keyed first when entering
 the cluster.
       * Cross-file searching is a powerful technique that may still be the
 most effective way to perform many searches. I found it useful to extract
 information from one file for use in subsequent files. PRINT SELECT is
 ORBIT's command
 that transfers terms to a SELECT list. Each SELECT list can be named by
 using the TOSEL option. It is very helpful to refer to the saved lists by
 name. You must use PRINT SELECT if you would like to have a Derwent record
 for every citation found in other files.
      * XCLAIM is the command to move results from an expensive CLAIMS file
 to a less costly one. In general, XCLAIM works very well in PowerSearch.
 However, it appears to work but actually fails when the file you move to
 has already been used in a session. In this case, the expensive file
 remains open and is used for
 searching. (As a result of reviewing this article prior to publication,
 ORBIT is in the process of fixing this problem. - NG)
       * MERGE is used for printing. Use REPEAT or other logic to create
 searchable sets from multiple files.
      * If none of the specified print fields exist for a record, only a
 message indicating the problem is given. Printing the item number in the
 set in a different format will obtain some information about the missing
 record.
      * The SUBS (subheading) command remains in effect for the entire
 PowerSearch session. (SUBS CANCEL turns it off.)
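      The file-order tip above (USPM versus WPAT) is easy to check with a small sketch. The find_duplicates helper and the patent numbers are hypothetical. The sketch assumes that a later record counts as a duplicate only when it shares a number with the first record and adds none of its own, which is a simplification of ORBIT's stated definitions.

```python
def find_duplicates(records, file_order):
    """Return the files whose record is flagged as a duplicate.

    `records` maps a file name to the set of patent numbers its record
    lists. The record from the first file in `file_order` is always kept.
    """
    first_file, *later_files = file_order
    first_numbers = records[first_file]
    duplicates = []
    for name in later_files:
        numbers = records[name]
        # Assumed rule: shares a number with the first record, adds none.
        if numbers & first_numbers and numbers <= first_numbers:
            duplicates.append(name)
    return duplicates


# Invented numbers: USPM cites one U.S. patent; WPAT lists the same patent
# plus two equivalents.
records = {
    "USPM": {"US5000001"},
    "WPAT": {"US5000001", "EP0123456", "JP03012345"},
}

print(find_duplicates(records, ["USPM", "WPAT"]))   # []
print(find_duplicates(records, ["WPAT", "USPM"]))   # ['USPM']
```

With USPM first, the WPAT record carries extra equivalents and so is not a duplicate; after reordering, the USPM record adds nothing to the WPAT record and is flagged.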
      A CLOSER LOOK AT
      POWERSEARCH PATENT
      GROUPING
      The search in Figure 1 uses several features of PowerSearch to
 collect and group a set of patents mentioning minoxidil. The goal of the
 search was to obtain a manageably large set of records from several
 companies and countries and examine the results. I wanted to obtain
 Derwent's family information for every invention found by searches of
 Chemical Abstracts, CLAIMS, Current Patents, INPANEW and WPAT.
      To do this, I extracted the patent, application and priority numbers
 from the search results of the various files. The WPAT search combined
 these results with the additional records found by a single file search of
 WPAT. I extracted and re-searched the Derwent priority numbers to find
 additional related records. All 518 of the records from all of the files
 were then merged. After reordering to make WPAT the first file, an ID
 display reported 227 patent groups containing 243 duplicate patent records.
 I studied these groups to become more familiar with grouping.
      I was impressed by the success of the cross-file searching. The
 search of Chemical Abstracts had patents from 13 countries and covered the
 time period 1975 through November 1993. All but one patent group had one or
 more records from WPAT. The exception was an Australian invention not in
 the WPAT database. An INPADOC family search did not show any equivalents in
 other countries. Chemical Abstracts and INPADOC are the only ORBIT
 databases that cover this invention.
      The records in each patent group are sorted in file order. Duplicate
 records are clearly labeled. The groups are arranged in reverse
 chronological order for the newest record from the first file. All groups
 with records from the first file print first. In the minoxidil search, the
 last group did not have a WPAT record. Browsing the end of the file makes
 these easy to spot. This technique also spotlights non-convention cases
 that have not been tied to their convention equivalent, and so helps the
 searcher make such connections.
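      The print order described above can be expressed as two sorts: records inside a group by file order, then groups by the newest first-file record, with groups lacking a first-file record at the end. The order_for_print function, the record fields, the "CA" file label, and the dates are hypothetical stand-ins chosen for this sketch.

```python
from datetime import date


def order_for_print(groups, file_order):
    """Sort records inside each group by file order, then sort the groups."""
    rank = {name: position for position, name in enumerate(file_order)}
    first_file = file_order[0]

    def group_key(group):
        dates = [rec["date"] for rec in group if rec["file"] == first_file]
        if dates:
            # Groups with a first-file record come first, newest first.
            return (0, -max(dates).toordinal())
        return (1, 0)   # groups without a first-file record print last

    in_file_order = [sorted(g, key=lambda rec: rank[rec["file"]])
                     for g in groups]
    return sorted(in_file_order, key=group_key)


# Invented groups; "CA" is a made-up label for a Chemical Abstracts file.
groups = [
    [{"id": "ca-1", "file": "CA", "date": date(1990, 5, 1)},
     {"id": "wpat-1", "file": "WPAT", "date": date(1990, 6, 1)}],
    [{"id": "ca-2", "file": "CA", "date": date(1985, 1, 1)}],
    [{"id": "wpat-2", "file": "WPAT", "date": date(1993, 11, 1)},
     {"id": "claims-1", "file": "CLAIMS", "date": date(1993, 10, 1)}],
]

ordered = order_for_print(groups, ["WPAT", "CLAIMS", "CA"])
for group in ordered:
    print([rec["id"] for rec in group])
# ['wpat-2', 'claims-1']
# ['wpat-1', 'ca-1']
# ['ca-2']
```

The group with no WPAT record falls to the end of the output, which is why browsing the end of a printout exposes inventions missing from the first file.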
      I didn't find anything wrong with the grouping. WPAT records that
 were related to other WPAT records by priority number were nicely tied
 together. One case even added a WPAT record that didn't have a common
 priority or application number. This group is shown in Figure 2. One of the
 12 application numbers in the CLAIMS record is the only link to Derwent
 89-070403/10. On their own, the three WPAT records group into two patent
 groups. Adding the Chemical Abstracts records does not change this. It's
 the CLAIMS record that pulls it all together. The XR cross-reference field
 in WPAT for 92-323296/39 does list both of the other records but the
 priority link has been omitted from the database. (This omission is also
 present in the DIALOG file.)
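      The way a single CLAIMS record can pull two groups together is an instance of transitive linking. One common way to compute such groups is a union-find pass over shared numbers, sketched below. This is a swapped-in textbook technique, not a description of how ORBIT works internally, and the record names and numbers are invented placeholders rather than the data from Figure 2.

```python
def group_by_shared_numbers(records):
    """Group records linked, directly or through other records, by any
    shared priority or application number (union-find)."""
    parent = {name: name for name in records}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]   # path halving
            name = parent[name]
        return name

    owner = {}                                    # number -> first record seen
    for name, numbers in records.items():
        for number in numbers:
            if number in owner:
                parent[find(name)] = find(owner[number])
            else:
                owner[number] = name

    groups = {}
    for name in records:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


# Three records from one file: two share a priority, the third stands alone.
wpat_only = {
    "WPAT-1": {"P1"},
    "WPAT-2": {"P1", "A2"},
    "WPAT-3": {"A9"},
}
print(group_by_shared_numbers(wpat_only))
# [['WPAT-1', 'WPAT-2'], ['WPAT-3']]

# A record from another file lists application numbers found in both groups.
with_claims = dict(wpat_only)
with_claims["CLAIMS-1"] = {"A2", "A9"}
print(group_by_shared_numbers(with_claims))
# [['CLAIMS-1', 'WPAT-1', 'WPAT-2', 'WPAT-3']]
```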
      Another advantage of the merged results is the ease with which
 irrelevant Derwent records found by a search for related documents can be
 identified. These records share a common priority number but deal with very
 different subject matter and belong to different companies. PRINT PATGROUP
 pulls them together.
      A further value of grouping is the ability to bring together as much
 information about an invention as possible and desired. The differently
 enhanced titles from several databases combine with indexing aids in
 screening search results. The detailed U.S. application data from CLAIMS is
 a useful addition to the data from the other files. I have often manually
 edited search reports to obtain similar results. The ability to issue a
 command and have the editing automatically performed is very helpful. The
 direct benefit of grouping is having fewer separate inventions to screen
 for relevancy and more data to help evaluate the results.
      WISH LIST
      PowerSearch in its initial form has succeeded in assisting patent
 searchers with retrieval and analyses of patent information. Several
 enhancements would make it even better:
      * The NBR command must be expanded to operate with true multifile
 searching. Database indexes have many inconsistencies such as spelling,
 terminology, inventor and assignee names. It is inefficient to specify each
 file separately. Providing a multimeaning display for use with truncation
 in multifiles would be an acceptable patch for the short term. (This would
 work like the ROOT command on other systems.) However, a true multifile
 NBR would be much more desirable.
      * It would be extremely useful to see grouping evolve so the user can
 specify different print fields for each database. ORBIT has indicated that
 it is heading in this direction. The next step should be to make it
 intelligent and flexible. I do not want repeated copies of the same
 assignee information for an invention, but I also do not want to miss cases
 where it varies. (I was able to print different formats for each database
 in an STN multifile session. I created a format in each file and saved it
 under the same name.)
      * Can cross-file searching be streamlined or customized? I grew
 accustomed to typing PRT SEL SET PN, PR, AP TOSEL xxxx but often made
 typing errors.
     * Including cross-file commands in HISTORY would be valuable.
      CONCLUSIONS
      In 1983, Stu Kaback [6] fantasized about a master file of patent
 information incorporating the inputs of CAS, Derwent, API and IFI. At that
 time, cross-file operations could be used to perform some aspects of this
 concept. In 1987, when OneSearch had just been announced, he mused:
      Now, file cluster searching, or OneSearch, is likely to save time
 used in searching multiple files one after the other, but as soon as I saw
 it, I wondered about a possible enhancement. When you're cluster searching,
 each of the records is merely the record from an individual database. What
 about the possibility of an algorithm that would tell the computer to
 combine all records with a common bibliographic element, most probably the
 priority number for patent files. Then each patent would have in one
 superrecord the Derwent codes, the CAS Registry numbers, the CLAIMS-CDB
 terms and roles, the API terms, the WPI family, everything . . . Here one
 would have the best of all worlds, in which every patent family would be
 searchable by ALL of the attributes assigned to it by all databases. [7]
      ORBIT's ability to group duplicate and related patent records from
 various databases is another significant step toward this ideal world. No
 other online service has successfully brought patent records dealing with
 the same invention together. The searcher still needs to do meticulous
 cross-file searching but the final printout organizes the results in a
 useful way that previously required extensive post-processing to
 accomplish. The 'super record' concept for searching and displaying is not
 fully implemented, but it is getting closer. For many patent searches,
 ORBIT should be the databank of choice.
       CROSS-FILE SEARCHING
       AND IMPLIED FILE
       MERGERS
      Cross-file searching should not be confused with multifile searching.
 In multifile searching a group of files may be searched simultaneously, but
 each file is queried with only its own parameters, or with a common set of
 parameters. In cross-file searching, parameters extracted from one file are
 brought to bear against a second file, enabling the selection of those
 references that include both Term A from File AA and Term B from File BB,
 thus allowing the viewpoints of the two files to interact. Multifile
 searching is certainly convenient, but cross-file searching is a far more
 powerful technique.
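      The distinction can be shown with two small functions. This is a sketch under assumptions: the record layout, the "priority" key used to carry values from one file to the other, and both function names are hypothetical. Only the file names AA and BB and the Term A and Term B roles come from the text.

```python
def multifile_search(files, term):
    """Multifile: run the same query against every file independently."""
    return {name: [rec for rec in records if term in rec["terms"]]
            for name, records in files.items()}


def cross_file_search(files, source, source_term, target, target_term,
                      key="priority"):
    """Cross-file: extract key values from the hits in `source`, then use
    them together with `target_term` as the query against `target`."""
    selected = {rec[key] for rec in files[source]
                if source_term in rec["terms"]}
    return [rec for rec in files[target]
            if rec[key] in selected and target_term in rec["terms"]]


# Invented records: each file indexes the same inventions differently.
files = {
    "AA": [
        {"priority": "P1", "terms": {"term a"}},
        {"priority": "P2", "terms": {"term a"}},
        {"priority": "P3", "terms": {"other"}},
    ],
    "BB": [
        {"priority": "P1", "terms": {"term b"}},
        {"priority": "P2", "terms": {"other"}},
        {"priority": "P3", "terms": {"term b"}},
    ],
}

# Multifile: File BB has no "term a" indexing, so it contributes nothing.
print({name: len(hits)
       for name, hits in multifile_search(files, "term a").items()})
# {'AA': 2, 'BB': 0}

# Cross-file: only P1 carries Term A in File AA and Term B in File BB.
print([rec["priority"] for rec in
       cross_file_search(files, "AA", "term a", "BB", "term b")])
# ['P1']
```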
      File mergers to produce the utopian All-the-Parameters file of my
 fantasies are unlikely to happen. The WPI-APIPAT merger gave us a wonderful
 product, but the experience of ORBIT, Derwent, and API in combining these
 databases showed that such mergers are highly complicated (read costly),
 even in a situation that seemed made-to-order for such a merger. But
 PowerSearch can, in some instances, permit virtual file mergers, a concept
 I first heard enunciated by Nancy Lambert of Chevron.

       [Search File AA with a set of parameters including Term A. Search]
 File BB with a set of parameters including Term B. Merge the two sets and
 PRINT GROUPS. Those groups that include a reference from File AA and its
 equivalent from File BB turn out to include both Term A and Term B. It's
 very much like a merger of the two files - Nancy's virtual merger.
 The concept has tremendous potential for high-powered searching.
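      A virtual merger of this kind can be approximated by merging two hit lists, grouping them, and keeping only the groups that hold a record from each file. In the sketch below, the virtual_merge function, the record fields, and the use of a shared "priority" value as the grouping key are hypothetical assumptions; the actual grouping relies on ORBIT's own duplicate and related-record rules.

```python
from collections import defaultdict


def virtual_merge(hits_aa, hits_bb, key="priority"):
    """Merge two hit lists, group them by `key`, and keep only groups that
    contain records from both files. Returns key -> combined index terms."""
    groups = defaultdict(list)
    for record in hits_aa + hits_bb:
        groups[record[key]].append(record)

    merged = {}
    for value, members in groups.items():
        files_present = {record["file"] for record in members}
        if files_present >= {"AA", "BB"}:
            merged[value] = set().union(*(rec["terms"] for rec in members))
    return merged


# Invented hits: File AA was searched for Term A, File BB for Term B.
hits_aa = [
    {"file": "AA", "priority": "P1", "terms": {"term a"}},
    {"file": "AA", "priority": "P2", "terms": {"term a"}},
]
hits_bb = [
    {"file": "BB", "priority": "P1", "terms": {"term b"}},
    {"file": "BB", "priority": "P3", "terms": {"term b"}},
]

result = virtual_merge(hits_aa, hits_bb)
print(sorted(result))            # ['P1']
print(sorted(result["P1"]))      # ['term a', 'term b']
```

Only the group with a record from each file survives, and its combined terms include both Term A and Term B, as if the two files had been merged for that invention.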
      ACKNOWLEDGMENT
      I want to thank ORBIT and STN for allowing me to freely explore their
 systems. A special thanks to Sarah Radick of ORBIT for her helpfulness and
 quick response to all of my thoughts and questions.
      REFERENCES
     [1] Gregory, Andrew. 'Maxwell Online And The Future . . . InfoPro
 Technologies.' ONLINE 16, No. 6 (November 1992): pp. 10-11.
     [2] InfoPro Technologies announced the sale of ORBIT Online Service to
 Questel on February 1, 1994. See the newspages in ONLINE, March 1994, p.
 13.
     [3] Van Camp, Ann J. 'StarSearch For The Health Sciences.' DATABASE
 14, No. 5 (October 1991): pp. 99-101.
     [4] Miller, Carmen. 'Detecting Duplicates: A Searcher's Dream Come
 True.' ONLINE 14, No. 4 (July 1990): pp. 27-34.
     [5] PowerSearch System Reference Guide (November 1993).
     [6] Kaback, Stuart M. 'Online Patent Searching: The Realities.' ONLINE
 7, No. 4 (July 1983): pp. 22-31.
     [7] Kaback, Stuart M. 'What's News?' World Patent Information 9, No. 4
 (1987): pp. 258-259.



