# Finding scientific information: Truncation of search terms

Truncation of search terms

Truncation symbols

Truncating a search term lets the search programme retrieve inflected forms of the word as well. Databases handle truncation differently. Normally the truncation symbol is added at the end of the word, but some databases also allow it at the beginning of the word. The symbol itself varies from database to database, so check the instructions of the database you are using. The most commonly used truncation symbols are the asterisk (*), question mark (?), dollar sign ($), hash mark (#) and exclamation mark (!).

Avoid searching with too short a word stem: many unrelated words are likely to share the same sequence of letters, so the search result can easily include irrelevant references. For this reason, some databases set a minimum length for a word stem.
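The effect of a truncation symbol can be sketched with a regular expression. This is only an illustration, assuming the asterisk as the truncation symbol and treating it as "any run of word characters"; the function names are hypothetical, and real databases apply their own rules.

```python
import re

def truncation_to_regex(term, symbol="*"):
    """Turn a truncated term such as 'view*' or '*view' into a regex (toy model)."""
    parts = [re.escape(p) for p in term.split(symbol)]
    # Assumption: each truncation symbol stands for any run of word characters,
    # including an empty one.
    return re.compile(r"\b" + r"\w*".join(parts) + r"\b", re.IGNORECASE)

def search(term, text, symbol="*"):
    """Return every word in the text that matches the truncated term."""
    return truncation_to_regex(term, symbol).findall(text)

text = "Viewers were viewing a preview; the cat sat in a cathedral catalogue."
print(search("view*", text))  # truncation at the end of the word
print(search("*view", text))  # truncation at the beginning of the word
print(search("cat*", text))   # short stem: unrelated words match as well
```

The last call shows why a short stem is risky: 'cat*' also retrieves 'cathedral' and 'catalogue'.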

Some databases use automatic word truncation, which looks for all the words that contain or start with the same string of characters as the one typed in the search box. However, unlike an automatic stemming system, automatic word truncation does not recognize the stems of words. Thus 'viewer' finds 'viewers' but not 'viewing'.
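A minimal sketch of this behaviour, assuming the "starts with" variant and a small hypothetical word list (the function name is made up for illustration):

```python
def automatic_truncation(query, vocabulary):
    """Return the words that start with the literal query string (toy model)."""
    q = query.lower()
    return [w for w in vocabulary if w.lower().startswith(q)]

words = ["view", "viewer", "viewers", "viewing", "preview"]
print(automatic_truncation("viewer", words))
```

Because the comparison is made on the typed string itself, 'viewing' is not retrieved: it does not begin with 'viewer'.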

Automatic stemming

Some databases use an automatic stemming system, which recognizes the stems of words and searches for words containing that stem. For example, if your search word is 'viewer', the results contain references that include words with the stem 'view', such as 'view', 'viewing' and 'preview'. In some databases, a word has to be written in its basic form for automatic stemming to work.
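The idea can be sketched with a deliberately naive suffix stripper. The suffix list, the minimum stem length and the function names are assumptions made for this illustration; production stemmers use far more elaborate rules.

```python
# Toy suffix list, longest suffixes first (an assumption for this sketch).
SUFFIXES = ("ings", "ing", "ers", "er", "ed", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w

def stem_search(query, vocabulary):
    """Return the words that contain the stem of the query word."""
    stem = naive_stem(query)
    return [w for w in vocabulary if stem in w.lower()]

words = ["view", "viewer", "viewing", "preview", "vie", "window"]
print(naive_stem("viewer"))
print(stem_search("viewer", words))
```

Unlike the automatic truncation described earlier, the search here is made with the stem 'view' rather than the typed word 'viewer', so 'viewing' and 'preview' are retrieved too.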

Stemming rules, however, are database specific and need to be checked in the instructions of the database. For example, in some databases stemming applies only to the singular and plural forms of words, in many databases stemming does not work in phrases, and some databases compare word stems only with the words in their own thesauruses and therefore cannot find all possible inflected forms. If you want to look for a word only in its exact written form, automatic stemming has to be switched off according to the instructions of the particular database.

In the arXiv database, the search engine automatically looks for the word written in the search box in all its different forms.
Source: arxiv.org <http://arxiv.org> 12.12.2008.

In the Scopus database, for most words, the singular form of a word automatically finds the plural and possessive forms as well, but not other words with the same stem. The truncation symbol in Scopus is the asterisk (*).
Source: Scopus <http://www.scopus.com> 28.6.2011.

In the PubMed database, the search engine by default compares a given search term with the terms in the medical thesaurus MeSH. On the Details tab, you can then check which search words were actually used in the search.
Source: PubMed <http://www.ncbi.nlm.nih.gov/pubmed/> 28.6.2011.



Substitution symbols for letters

A substitution symbol usually replaces exactly one letter. Some databases also provide a substitution symbol for zero or one letter. By using a substitution symbol you can cover a word's different spellings (organization, organisation) and irregular plurals or inflected forms (woman, women) without writing each of them out. Usually a substitution symbol can be repeated. Substitution symbols are database specific, and their use should be checked in the instructions of the particular database. Fuzzy search is an automatic alternative to truncation and substitution symbols: it looks for similar words while comparing the other contents of the reference in order to keep relevance high. As yet, however, it is used in very few search engines.
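The "similar words" part of a fuzzy search can be approximated with Python's standard `difflib` module. This is only a sketch: the index terms and the cutoff value are assumptions, and the relevance comparison against the rest of the reference is not modelled at all.

```python
from difflib import get_close_matches

def fuzzy_lookup(query, vocabulary, cutoff=0.8):
    """Return the index terms whose spelling is close to the query (toy model)."""
    # cutoff is a similarity threshold between 0 and 1; 0.8 is an arbitrary choice.
    return get_close_matches(query.lower(), vocabulary, n=5, cutoff=cutoff)

index_terms = ["organisation", "organism", "orchestration", "women"]
print(fuzzy_lookup("organization", index_terms))
```

With this cutoff the British spelling 'organisation' is retrieved for the query 'organization', while the less similar 'organism' and 'orchestration' are not.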

In the Web of Science database, ? substitutes for exactly one character, whereas $ substitutes for either one character or none. Here the American and British English spellings 'behavior' and 'behaviour', as well as the irregular plural of 'woman', have been taken into account.
Source: Thomson Reuters - Web of Knowledge (Web of Science) <http://apps.webofknowledge.com> 19.7.2013.

In some cases, a substitution symbol can be repeated, as in this example from Web of Science.
Source: Thomson Reuters - Web of Knowledge (Web of Science) <http://apps.webofknowledge.com> 19.7.2013.
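How such substitution symbols behave can be sketched by translating them into a regular expression. The sketch assumes the Web of Science conventions described above (? for exactly one character, $ for zero or one character); the function names, the word list and the search terms are illustrative and are not taken from the original screenshots.

```python
import re

def wildcard_to_regex(term):
    """Translate '?' (exactly one character) and '$' (zero or one character)."""
    pieces = []
    for ch in term:
        if ch == "?":
            pieces.append(r"\w")    # exactly one word character
        elif ch == "$":
            pieces.append(r"\w?")   # one word character or none
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces), re.IGNORECASE)

def matches(term, words):
    """Return the words that match the whole search term."""
    rx = wildcard_to_regex(term)
    return [w for w in words if rx.fullmatch(w)]

words = ["woman", "women", "behavior", "behaviour",
         "organization", "organisation", "fit", "feet", "foot"]
print(matches("wom?n", words))         # irregular plural
print(matches("behavio$r", words))     # American and British spelling
print(matches("organi?ation", words))  # s/z spelling variants
print(matches("f??t", words))          # repeated substitution symbol
```

In the last call the repeated symbol requires exactly two characters, so 'feet' and 'foot' match but 'fit' does not.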