Insight #3 – Proximity of Words
When you enter words into a search engine such as <john smith>, the search engine will parse such a request into two labels: one called <john> and one called <smith>. It will then go through billions of pages of text looking for instances where the label <john> is next to the label <smith>.
There are two important things to note about this process:
• The results that will get the highest ranking are those where <john> and <smith> are located closest to each other.
• Generally, no preference is given to the ordering of the words. Therefore, the label <smith> could come before <john> and be given just as high a ranking in the search results.
This concept can be extended to three or more words. Suppose you are looking for an ancestor named John William Smith. Entering <john william smith> into a search engine will get parsed into <john> <william> and <smith>. The search engine will look for instances where these words are next to each other.
There are lots of John William Smiths in the world so this kind of search will pull up lots of records. What happens if you search for a more unique name? For example, suppose we were searching for Ezra Pound, the famous American poet. Ezra Pound (whose full name was Ezra Weston Loomis Pound) is not only famous but he also happens to have a rather unique name. It should be easy to find at least one unique ancestral records related to him (for the purposes of this demonstration, we are going to shorten the search name to Ezra Weston Pound).
Here are the search results that come up using the free genealogy search engine:
We chose the example of Ezra Pound to highlight three things:
• To demonstrate how a proximity search works using a real life example.
• To show that a proximity search using three words can be more powerful than doing proximity searches using just two words. As can be seen above, the first search result finds a viable ancestral record.
• To reinforce the fact a name is just a label to a search engine. Our famous American poet has a last name (Pound) that also happens to be a standalone word. This is a fairly common occurrence with surnames. What makes it interesting in this case is the word pound can appear in genealogy records. In particular, pound can be associated with weight. Thus, we can see results appearing that are unrelated to an individual.
This leads us to our next insight...