Terry's ORA Tips

Transforms

This page updated 10 Feb 2024

 

This article describes what I believe are the most useful Transforms offered by Online Repository Assistant (ORA). Transforms are used to modify the data ORA obtains from genealogical services to suit the user's preferences. Other articles in my ORA Section cover other topics about using the software.

Topics Included in this Article
Formatting Dates
Changing the way dates are output
Date Arithmetic
Adding or subtracting days, months, or years to a date
Separating Parts of Names
Separating parts of a Name field
Surname First Names
Changing surname-first name to normal order
Separating Parts of a Field
Separating Place and other parts from a combined field
Counting Parts in a Field
Counting how many parts there are in a multipart name or place field
Replacing Parts of Text
Replacing or removing characters in a field
Extracting Parts of Text
Extracting desired characters from a field (new 10 Feb 2024)
State Names and Postal Codes
Converting State names to postal codes, or the reverse
Changing the Case of Text
Uppercase, lowercase, initial caps, have it your way
Changing Numbers to Words
Spelling out numbers
Using Multiple Transforms
When more than one Transform is required to achieve your result
Conclusions
Additional comments and links on Templates

My article on Template Basics describes how to construct Templates in ORA to manipulate the data collected and displayed in the OraPanel, or typed into your genealogy program. Transforms, as their name suggests, "transform" the data collected in a variety of ways so that the data collected appears in the output in formats you may prefer.

This article describes the most useful, in my opinion, of the many available Transforms. For information about others not described here see ORA Help.

Transforms are entered inside the square brackets that enclose the name of a Variable, following the name and separated by a colon. The general format is:

[Variable:Transform]

Some Transforms have parameters that control exactly how they work. For a Transform with parameters the general format is:

[Variable:Transform:parameter1:parameter2]

The following sections describe some of the available Transforms in detail.

Formatting Dates

My article on ORA Basics describes how to set the format ORA will use to present data items recognized as dates. You would generally set this to the format you will use most often. Template Variables referring to fields with dates will output them in the selected format. If none of these choices suits your preferences, or if you want different formats for different purposes, Transforms are available to change the format output by a specific Template.

The date Transform uses the term ":date" followed by one or more parameters. Each parameter is a series of code letters separated by literal text characters. The codes for the date elements are listed in the table below:

Code Output
d day, written as a one or two-digit number
dd day, written as a two-digit number, with leading zero if needed
m month, written as a one or two-digit number, e.g. 1, or 11
mm month, written as a two-digit number with leading zero if needed, e.g. 01
mmm month, written as a three-letter abbreviation in lower-case, e.g. jan
Mmm month, written as a three-letter abbreviation in mixed-case, e.g. Jan
MMM month, written as a three-letter abbreviation in upper-case, e.g. JAN
mmmm month, written out in lower case, e.g. january
Mmmm month, written out in mixed case, e.g. January
MMMM month, written out in upper case, e.g. JANUARY
yyy year, written as a three or four-digit number
yyyy year, written as four digits, with leading zero for years before 1000

The characters separating the day, month, and year codes can be almost any literal text. The letters d, m, and y, and characters with special meaning to ORA, like colons, angle brackets, and square brackets, must be preceded by the escape character, a backslash ( \ ), as in the last example below.

Some examples of date Variables with different Transforms are listed in the following table, where the Date field contains the date 24 March 1950:

Variable and Transform Sample Output
[Date:date:yyyy-mm-dd] 1950-03-24
[Date:date:Mmmm d, yyyy] March 24, 1950
[Date:date:d Mmm yyyy] 24 Mar 1950
[Date:date:\date\: d Mmm, yyy] date: 24 Mar, 1950

The last example illustrates that the output of the :date Transform can include any literal text as long as an escape character precedes any character that has special meaning. In this example that applies to the " d " in "date", because it would otherwise be seen as a request to output the day, and to the colon after "date", because it would otherwise be seen as a separator for another parameter.

These examples use only a single parameter, which is applied in all conditions. When the second, optional, parameter is present, it is used when the date contains a month and year but no day. The third is used when there is a year but no month or day. When these parameters are omitted the format specified by the preceding parameter is used in those cases, with the missing day or month output as "0."
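
For readers who find code easier to follow than a table, the date codes and the optional second and third parameters can be sketched in Python. This is only an illustration of the rules described above, not ORA's own implementation; the names format_date and date_transform are hypothetical.

```python
import re
from typing import Optional, Sequence

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Longest codes come first so "mmmm" is not read as "mmm" plus "m".
# The first alternative captures an escaped character so it is output literally.
TOKEN = re.compile(r"\\(.)|(yyyy|yyy|Mmmm|mmmm|MMMM|Mmm|mmm|MMM|mm|m|dd|d)")


def format_date(pattern: str, year: int, month: Optional[int] = None,
                day: Optional[int] = None) -> str:
    """Render one format parameter; a missing day or month is output as 0."""

    def render(match: "re.Match[str]") -> str:
        if match.group(1) is not None:        # escaped character, e.g. \d or \:
            return match.group(1)
        code = match.group(2)
        if code.startswith("d"):
            value = day or 0
            return f"{value:02d}" if code == "dd" else str(value)
        if code.startswith("y"):
            return f"{year:04d}" if code == "yyyy" else str(year)
        value = month or 0                    # everything else is a month code
        if len(code) == 1:
            return str(value)
        if len(code) == 2:
            return f"{value:02d}"
        if not month:
            return "0"
        name = MONTHS[month - 1]
        if len(code) == 3:
            name = name[:3]
        if code.isupper():
            return name.upper()
        if code.islower():
            return name.lower()
        return name                           # mixed case, e.g. Mmm or Mmmm

    return TOKEN.sub(render, pattern)


def date_transform(params: Sequence[str], year: int,
                   month: Optional[int] = None, day: Optional[int] = None) -> str:
    """Pick the parameter for full, month-and-year, or year-only dates."""
    wanted = 2 if month is None else 1 if day is None else 0
    # When a later parameter is omitted, fall back to the preceding one.
    pattern = params[min(wanted, len(params) - 1)]
    return format_date(pattern, year, month, day)


print(date_transform([r"\date\: d Mmm, yyy"], 1950, 3, 24))
```

Calling date_transform with three parameters, such as "d Mmm yyyy", "Mmm yyyy", and "yyyy", shows how a date lacking a day or month selects the later formats.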

Date Arithmetic

Sometimes we may want to compute a new date based on a date that appears in a field. For example, we may want to compute an estimated date of burial from a record that provided a date of death. The dateAdd and dateSubtract Transforms will add or subtract a specified number of days, months and years to a date that appears in a record. The date must include year, month and day parts, and must not include a modifier like before or after.

The dateAdd and dateSubtract Transforms use the terms ":dateAdd" and ":dateSubtract" followed by a parameter to specify the number of days, months, and years to be added or subtracted. They are entered as numbers followed by the letters "d", "m", and "y" to indicate days, months, and years, respectively, each separated by a space when there is more than one. If no letter is entered the value is assumed to be days.

Some examples of date Variables with different Transforms are listed in the following table, where the Date field contains the date 24 March 1950:

Variable and Transform Sample Output
[Date:dateAdd:2] March 26, 1950
[Date:dateAdd:1y] March 24, 1951
[Date:dateAdd:1y 1m 1d] April 25, 1951
[Date:dateSubtract:2m] January 24, 1950

The Transform Help page includes discussion of cautions when using these Transforms when the dates involved are not using the Julian calendar. It also lists a number of other Transforms for dates.
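
The parameter format for these two Transforms can be sketched in Python as follows. This is an illustration only, not ORA's implementation. The names parse_offset and shift_date are hypothetical. Applying years and months before days, and clamping to the last day of a shorter month, are assumptions of this sketch. Python's date type uses the proleptic Gregorian calendar, so the calendar cautions mentioned above are not modeled.

```python
import calendar
import re
from datetime import date, timedelta


def parse_offset(text: str) -> tuple:
    """Return (days, months, years) from a parameter such as "1y 1m 1d" or "2"."""
    days = months = years = 0
    for part in text.split():
        match = re.fullmatch(r"(\d+)([dmy]?)", part)
        if match is None:
            raise ValueError(f"unrecognized offset: {part!r}")
        amount = int(match.group(1))
        unit = match.group(2) or "d"          # no letter means days
        if unit == "d":
            days += amount
        elif unit == "m":
            months += amount
        else:
            years += amount
    return days, months, years


def shift_date(start: date, offset: str, sign: int = 1) -> date:
    """Add (sign=1) or subtract (sign=-1) the offset from a complete date."""
    days, months, years = parse_offset(offset)
    # Assumption: whole months (a year counts as 12) are applied first.
    total = start.year * 12 + (start.month - 1) + sign * (years * 12 + months)
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Assumption: clamp the day if the target month is shorter.
    last_day = calendar.monthrange(year, month)[1]
    shifted = date(year, month, min(start.day, last_day))
    return shifted + timedelta(days=sign * days)


print(shift_date(date(1950, 3, 24), "1y 1m 1d"))   # 1951-04-25
print(shift_date(date(1950, 3, 24), "2m", sign=-1))  # 1950-01-24
```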

Separating Parts of Names

Many records have a single field that contains multiple parts of a person's name, for example title, given name or names, surname, and suffix like Jr. or Sr. The name part Transforms can be used to output individual values from such a field when desired. The name part Transforms recognize four parts. The associated Transforms are: namePrefix, nameGiven, nameSurname, and nameSuffix. The surname part includes any "pre-surname" like van, de, or O'.

Separating name parts is a complex exercise because of the various ways names are recorded in the original document, and indexed. Differing conventions in different cultures create additional variations. ORA's name parser uses certain clues to attempt to extract the desired parts, and thus depends on the name appearing in the record in an expected format. The various parts are determined as follows:

Some examples of name parts produced by these Transforms are listed in this table, where the Name field contains "Rev Robert Q. la Jones, Snr" and the Optional Name Part Characters box is checked:

Variable and Transform Sample Output
[Name:namePrefix] Rev.
[Name:nameGiven] Robert Q.
[Name:nameSurname] la Jones
[Name:nameSuffix] Sr.

ORA's name parser will produce accurate results in many cases, but cannot be expected to always be perfect. As can be seen from the description above, prefixes or pre-surnames that are not included on the lists, or suffixes not preceded by a comma, will produce incorrect results. Always review the output produced as you enter data into your genealogy program.

Names recorded with surname first will also produce incorrect results, but that can be corrected by use of the :nameToGivenFirst Transform described in the following section.

By default, the name part Transforms "standardize" the spelling, case and punctuation of prefixes and pre-surnames, and those suffixes that appear on the list of suffixes. For example, the suffix "Snr" is output as "Sr.", the pre-surname "Von" is output as "von", and the suffix "MD" is output as "M.D." All of these "standardizations" can be changed by the user by editing the lists of prefixes, pre-surnames, and suffixes as described later in this section.

Certain standardizations can be controlled by checking or unchecking the "Optional Name Part Characters" box at the bottom of the OraSettings page. By default, the periods after the prefixes Dr, Mr, and Mrs, and the suffixes Jr and Sr are included only if that box is checked.

If these "standardizations" are not desired, "raw" variations of each of the name part Transforms are available that output the parts exactly as they appear in the record. They are namePrefixRaw, nameGivenRaw, nameSurnameRaw, and nameSuffixRaw.

If your research often involves names with prefixes, pre-surnames, or suffixes that are not on the default lists, or if you prefer different standardizations of them, you may choose to customize the lists rather than manually editing the default output. Do that from the OraSettings window by clicking one of the buttons in the Name Parts section near the bottom of that page, as shown below:

Here we see the three buttons that open the Prefixes, PreSurnames, and Suffixes lists. Also shown is the check box for the Optional Name Part Characters. If we click, for example, the Name Prefixes button, the Edit Name Prefixes screen, as shown on the right, appears.

In the first column, labeled "Key," are the values the ORA name parser looks for to identify prefixes. Note that the Keys are not case-sensitive. That is, "dr" finds both "dr" and "Dr". It also finds "Dr." with a period.

In the second column, labeled "Value," are the values that the name parser outputs when it finds the text in the Key column. Note that some entries in that column have a character, in each case here a period, enclosed in square brackets. That character is only included in the output if the Optional Name Part Characters box, as shown in the illustration above, is checked.

To change the output for any Key, click the Edit button on that row and enter a new Value. To delete a Key and its associated Value, click the Delete button on that row.

To enter a new Key and Value pair, click the Add button near the top of the screen.

The lists of PreSurnames and Suffixes work similarly. The only difference is that the Suffixes list is used only to control "standardization" of suffix output, since suffixes are identified by a preceding comma rather than by their appearance on the list.

Surname First Names

Names may appear in a record with the surname first, rather than the usual order of given name, surname. Such names can be changed to given name, surname order with the nameToGivenFirst Transform. The result can be used directly, or can be followed with the name part Transforms described above to output individual parts of the name.

This Transform depends on the surname being followed by a comma, and will not work if the name recorded in the record does not have a comma after the surname.

Some examples of name Variables with this Transform alone, and followed by a name part Transform, are listed in the following table, where the Name field contains "La Jones, Rev Robert Q., Snr":

Variable and Transform Sample Output
[Name:nameToGivenFirst] Rev Robert Q. La Jones, Snr
[Name:nameToGivenFirst:nameGiven] Robert Q.
[Name:nameToGivenFirst:nameSuffix] Sr.
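
The reordering in the first row of the table can be sketched in Python. This is a guess at the general idea, not ORA's actual parser. The name to_given_first is hypothetical, and treating the text after a second comma as the suffix is an assumption based only on the example above.

```python
def to_given_first(name: str) -> str:
    """Move a leading "Surname," to follow the given names."""
    surname, comma, rest = name.partition(",")
    if not comma:
        # No comma after the surname, so nothing can be reordered.
        return name
    # Assumption: a later comma introduces a suffix such as Snr.
    given, suffix_comma, suffix = rest.rpartition(",")
    if suffix_comma:
        return f"{given.strip()} {surname.strip()}, {suffix.strip()}"
    return f"{rest.strip()} {surname.strip()}"


print(to_given_first("La Jones, Rev Robert Q., Snr"))
```

The result can then be handed to whatever separates the name parts, which mirrors following :nameToGivenFirst with a name part Transform.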

Separating Parts of a Field

Many records have multiple elements in a single field, for example a place field that contains city, county, and state, separated by commas. The split Transform can be used to output a single value from such a field.

The split Transform uses the term ":split" followed by two parameters. The first parameter tells the Transform which character separates the items, which are typically commas in place names, and spaces in the names of people, though others may occur. The second parameter is a number that tells the Transform which item in the field is desired. You can count from either the beginning or end of the field. Counting from the end is indicated by putting a minus sign in front of the number.

Some examples of place Variables with different split Transforms are listed in this table, where the Place field contains "Atlanta, Fulton, Georgia, USA":

Variable and Transform Sample Output
[Place:split:,:1] Atlanta
[Place:split:,:2] Fulton
[Place:split:,:-2] Georgia
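
The counting rule, from the beginning with a positive number or from the end with a negative one, can be sketched in Python. This is an illustration rather than ORA's code. The name split_part is hypothetical, and trimming surrounding spaces and returning nothing for an out-of-range number are assumptions.

```python
def split_part(value: str, separator: str, index: int) -> str:
    """Return the index-th item (1-based); negative numbers count from the end."""
    parts = [part.strip() for part in value.split(separator)]
    if index == 0 or abs(index) > len(parts):
        return ""                     # assumption: no such item means no output
    return parts[index - 1] if index > 0 else parts[index]


place = "Atlanta, Fulton, Georgia, USA"
print(split_part(place, ",", 1))     # Atlanta
print(split_part(place, ",", -2))    # Georgia
```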

In practice, separating parts of places, especially the town or city part, can be messy when records include ward numbers, districts, and other "place" names that may not be desired. My example Template article on Extracting City Names describes a more complex Template to address such cases.

Counting Parts in a Field

The previous section describes how to use the :split Transform to separate parts of a field that contains multiple parts, as is often found in fields for the names of people or places. A related Transform can count how many parts exist in such a field, which can be useful in constructing certain Templates.

Like other Transforms, it is entered in a Variable by adding the name of the Transform, ":splitCount", after the name of the field. It requires one parameter: the character that separates the items in the field. For example, this expression:

[Place:splitCount:,]

would produce these results:

Value in Field Template Output
Jackson, Shelby, Indiana 3
Shelby, Indiana 2

In practice I find this Transform most useful when combined with a Value Test Variable, as discussed in my article on Intermediate Template Methods. For example, this expression:

<[?:Place:splitCount:,=2]city of [Place]>

would produce these results:

Value in Field Template Output
Jackson, Indiana city of Jackson, Indiana
Jackson, Shelby, Indiana {nothing}

In the first case in this example, the count equals two, so the Value Test returns true and the remainder of the expression, consisting of the literal text "city of " and the name of the place, is output. In the second case, the count equals three, so the Value Test returns false and nothing is output.
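
The logic of that conditional expression can be sketched in Python. This is only an analogy for how the count and the Value Test combine, not a description of how ORA evaluates Templates. The names split_count and city_of are hypothetical.

```python
def split_count(value: str, separator: str) -> int:
    """Count the items in a field that are separated by the given character."""
    return len(value.split(separator))


def city_of(place: str) -> str:
    """Mimic <[?:Place:splitCount:,=2]city of [Place]>."""
    if split_count(place, ",") == 2:
        return f"city of {place}"
    return ""                         # the test failed, so nothing is output


print(city_of("Jackson, Indiana"))            # city of Jackson, Indiana
print(repr(city_of("Jackson, Shelby, Indiana")))  # ''
```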

Another example of the use of this Transform is in my Extracting City Names example Template.

Replacing Parts of Text

The fields ORA collects from the pages in a genealogical service sometimes contain characters we do not want to record in our data. For example, items in a field may be separated by underscore characters while we prefer to use a hyphen, comma, or space. Or a numeric field may contain leading zeros which we prefer to omit. The replace Transform can replace or remove undesired characters.

The replace Transform uses the term ":replace" followed by one required and two optional parameters. The first parameter tells the Transform what text is to be replaced. The second tells it what characters to replace that text with. If there is no second parameter the specified text is removed and nothing is put in its place.

This table provides some examples of the use of the replace Transform using just the first two parameters, which are sufficient in many cases:

Variable and Transform Value of Field Sample Output
[Film:replace:_:-] T625_375 T625-375
[Occupation:replace:_:, ] farmer_blacksmith farmer, blacksmith
[Roll:replace:^0+] 0360 360
[Name:replace:\b([a-z]) :$1. ] John F Jones John F. Jones

In the first two examples above the first parameter is ordinary text – the underscore character. There is a second parameter so the underscore is replaced with a hyphen in the first example and a comma followed by a space in the second.

In the third example the first parameter – " ^0+ " – is a "Regular Expression" that means "find the number zero as many times as it occurs at the beginning of the text." In this example there is no second parameter, so the leading zeros are removed but not replaced.

The fourth example also uses a Regular Expression for the first parameter – "\b([a-z]) " – that means "find a single letter at the start of a 'word' followed by a space." The term "$1" in the second parameter means to replace the single letter found with itself, and then a period and space are added to it.

For more on using Regular Expressions see my article on Using Regular Expressions in ORA.

The third parameter, which is optional, may contain "Flags" that control the operation of the replace Transform. There are two Flags available:

If the third parameter is omitted the Transform assumes both are to be used, as if you included both. To omit both Flags, that is to search for only the first occurrence of the target string and to match only the case of that string as entered, place a space in the third parameter. To use one of the Flags but not the other, enter the one you want to use.
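
The four examples in the table, and the effect of turning each Flag off, can be approximated in Python with the standard re module. This is a sketch for illustration, not ORA's implementation. The name replace_text and its keyword arguments are hypothetical, every first parameter is treated as a pattern, and the "$1" notation is translated to Python's own group syntax.

```python
import re


def replace_text(value: str, pattern: str, replacement: str = "",
                 all_matches: bool = True, ignore_case: bool = True) -> str:
    """Replace matches of pattern; both behaviors are on by default."""
    flags = re.IGNORECASE if ignore_case else 0
    count = 0 if all_matches else 1           # in re.sub, 0 means "every match"
    # Translate "$1" style references into Python's "\g<1>" form.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    return re.sub(pattern, py_replacement, value, count=count, flags=flags)


print(replace_text("T625_375", "_", "-"))                     # T625-375
print(replace_text("0360", "^0+"))                            # 360
print(replace_text("John F Jones", r"\b([a-z]) ", "$1. "))    # John F. Jones
print(replace_text("a_b_c", "_", "-", all_matches=False))     # a-b_c
```

The third call shows why ignoring case matters here: the pattern is written with lower-case letters but still finds the capital "F".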

Extracting Parts of Text

The :extract Transform, like the :replace Transform described above, allows one to use parts of the values ORA collects from the pages in a genealogical service. But instead of replacing or deleting undesired characters, this Transform focuses on the characters we want to save while excluding all others.

The extract Transform uses the term ":extract" followed by one required and one optional parameter. The first parameter tells the Transform what text is to be extracted.

This table provides some examples of the use of the extract Transform using just the first parameter:

Variable and Transform Value of Field Sample Output
[Age:extract:([0-9]+)] age 23 23
[Film:extract:([a-z][0-9]+)] T625_375 T625
[Date:extract:\((.*)\)] 2 mars 1854 (2 Mar 1854) 2 Mar 1854

The extract Transform is useful only when a Regular Expression is used in the first parameter.

In the first example the parameter – "([0-9]+) " – means "find one or more numbers together, and capture the numbers found."

In the second example the parameter – "([a-z][0-9]+) " – means "find one letter followed by one or more numbers, and capture the letter and numbers."

In the third example the parameter – "\((.*)\) " – means "find a left parenthesis followed by any number of characters and then a right parenthesis, and capture the characters between the parentheses."

As mentioned above, see my article on Using Regular Expressions in ORA for more information on how to work with Regular Expressions.

The second parameter, which is optional, may contain "Flags" as described above for the replace Transform.

The extract Transform always extracts only the first match if the pattern specified in the first parameter matches more than one part of the value of the field. If there is more than one match and one other than the first is desired, use the :extractIndex Transform. It differs in that it has three parameters, with the second being a number that specifies which match is to be extracted.
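
Capturing the wanted text, and choosing a match other than the first, can be sketched in Python with the re module. This illustrates the idea only and is not ORA's code. The name extract_text is hypothetical. Numbering matches from 1, ignoring case by default, and returning nothing when there is no match are assumptions of this sketch.

```python
import re


def extract_text(value: str, pattern: str, index: int = 1,
                 ignore_case: bool = True) -> str:
    """Return the captured text of the index-th match, or "" if there is none."""
    flags = re.IGNORECASE if ignore_case else 0
    matches = list(re.finditer(pattern, value, flags))
    if not 1 <= index <= len(matches):
        return ""
    match = matches[index - 1]
    # Use the first capture group when the pattern has one, else the whole match.
    if match.re.groups:
        return match.group(1) or ""
    return match.group(0)


print(extract_text("age 23", "([0-9]+)"))                       # 23
print(extract_text("T625_375", "([a-z][0-9]+)"))                # T625
print(extract_text("2 mars 1854 (2 Mar 1854)", r"\((.*)\)"))    # 2 Mar 1854
print(extract_text("T625_375", "([0-9]+)", index=2))            # 375
```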

State Names and Postal Codes

Genealogists generally avoid using postal codes for place names because they seem to change over time and may not be familiar to foreign readers. However, they are handy to save space in applications that are not published, like file names for saved images and labels for source definitions that are used only internally by your genealogy program. Most records seem to express state and province names spelled out in full. ORA provides a handy Transform to output those names as postal codes, and the reverse.

The abbr Transform converts a full state or province name to its postal code, and the full Transform does the reverse. They use the terms ":abbr" and ":full" respectively, and have a single parameter, the name of the lookup table to be used. Tables available as of this writing are:

Table Description
au_states Australian States and Territories
ca_provinces Canadian Provinces and Territories
place A combination of the other geographical tables
us_states US States

Some examples of place Variables using abbr and full Transforms are listed in this table:

Variable and Transform Value of State Sample Output
[State:abbr:us_states] California CA
[State:full:us_states] CA California
[State:full:us_states] California {no output}

Note the last example – when the value in the field is not a value found in the table there is no output.
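
The table lookup in both directions can be sketched in Python. This is an illustration, not ORA's implementation. The function names are hypothetical, and the table here holds only three sample entries rather than ORA's full us_states list.

```python
# Sample entries only; the real table covers every state.
TABLES = {
    "us_states": {"California": "CA", "Georgia": "GA", "Indiana": "IN"},
}


def abbr(value: str, table: str) -> str:
    """Full name to postal code; no output when the name is not in the table."""
    return TABLES[table].get(value.strip(), "")


def full(value: str, table: str) -> str:
    """Postal code to full name; no output when the code is not in the table."""
    reverse = {code: name for name, code in TABLES[table].items()}
    return reverse.get(value.strip(), "")


print(abbr("California", "us_states"))        # CA
print(full("CA", "us_states"))                # California
print(repr(full("California", "us_states")))  # ''
```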

Changing the Case of Text

There are several Transforms that manipulate the case of text found in fields when you prefer it be presented differently in the output of your Template. They are:

This table provides some examples of the use of these Transforms:

Variable and Transform Value of City Sample Output
[City:capitalize] WILKES-BARRE Wilkes-Barre
[City:initialCapital] WILKES-BARRE Wilkes-barre
[City:lowercase] Wilkes-Barre wilkes-barre
[City:uppercase] Wilkes-Barre WILKES-BARRE
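
The four rows of the table correspond closely to built-in Python string methods, which may help clarify the difference between the first two. This is an analogy only. The function names are hypothetical, and Python's title() has its own quirks (for example after apostrophes) that may not match ORA's behavior.

```python
def capitalize(value: str) -> str:
    """Capital at the start of each word, including after a hyphen."""
    return value.title()


def initial_capital(value: str) -> str:
    """Capital on the first letter only; everything else in lower case."""
    return value.capitalize()


print(capitalize("WILKES-BARRE"))        # Wilkes-Barre
print(initial_capital("WILKES-BARRE"))   # Wilkes-barre
print("Wilkes-Barre".lower())            # wilkes-barre
print("Wilkes-Barre".upper())            # WILKES-BARRE
```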

Changing Numbers to Words

Sometimes a numeric value is expressed in a document, or is extracted in the genealogy service record, as a numeral but we prefer to record it spelled out in words. The numberToWords Transform changes numbers up to 999,999 into words. The numbers may contain commas between thousands groups and decimal points, but other punctuation formats are not supported. Numbers not supported are output unchanged. Non-numeric values result in no output.

This table provides some examples of the use of this Transform:

Variable and Transform Value of Age Sample Output
[Age:numberToWords] 23 twenty three
[Age:numberToWords] 23.5 twenty three and 50/100
[Age:numberToWords] twenty three {no output}
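
The three outcomes in the table (words, words with a fraction, and no output) can be sketched in Python. This is a rough illustration rather than ORA's code. The name number_to_words is hypothetical. The wording of hundreds and thousands, and treating more than two decimal places as unsupported, are assumptions not confirmed by the examples above.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def _below_thousand(n: int) -> str:
    """Spell out 0 to 999 with spaces between words (no hyphens)."""
    words = []
    if n >= 100:
        words += [ONES[n // 100], "hundred"]
        n %= 100
    if n >= 20:
        words.append(TENS[n // 10])
        n %= 10
        if n:
            words.append(ONES[n])
    elif n or not words:
        words.append(ONES[n])
    return " ".join(words)


def number_to_words(value: str) -> str:
    text = value.strip()
    if not re.fullmatch(r"[0-9][0-9,]*(\.[0-9]+)?", text):
        return ""                     # non-numeric values give no output
    whole, _, fraction = text.replace(",", "").partition(".")
    number = int(whole)
    if number > 999_999 or len(fraction) > 2:
        return text                   # unsupported numbers are left unchanged
    if number >= 1000:
        words = _below_thousand(number // 1000) + " thousand"
        if number % 1000:
            words += " " + _below_thousand(number % 1000)
    else:
        words = _below_thousand(number)
    if fraction:
        # Assumption: fractions are expressed as hundredths, e.g. .5 as 50/100.
        words += f" and {fraction.ljust(2, '0')}/100"
    return words


print(number_to_words("23"))      # twenty three
print(number_to_words("23.5"))    # twenty three and 50/100
```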

Using Multiple Transforms

The above sections describe how to use Transforms to modify the data collected by ORA before pasting it into your genealogy program. But sometimes you may need to apply more than one Transform to obtain the result you want.

For example, you may need to extract the name of a state from a field that also contains the names of the city and county, and then change the result to a postal code. You can do that by "chaining" Transforms together, one after the other. The table below first extracts the name of the state and then converts it to a postal code, where the Place field contains "Atlanta, Fulton, Georgia":

Variable and Transform Sample Output
[Place:split:,:-1] Georgia
[Place:split:,:-1:abbr:us_states] GA

As seen here, the Transforms must be listed in the order in which they are to be applied to the data. In this case the name of the state must be extracted from the field before it can be looked up in a table to find the postal code.

If any of the Transforms other than the last has optional parameters, they must be supplied. Otherwise the name of the next Transform will be interpreted as a parameter, which is unlikely to succeed.
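
Chaining amounts to feeding the output of one step into the next, in order, which can be sketched in Python. This is only an analogy for the ordering rule, not how ORA processes a Variable. All names are hypothetical, and the state table holds a single sample entry.

```python
from functools import reduce

US_STATES = {"Georgia": "GA"}        # sample entry only


def split_part(value: str, separator: str, index: int) -> str:
    """Return the index-th item (1-based); negative numbers count from the end."""
    parts = [part.strip() for part in value.split(separator)]
    if index == 0 or abs(index) > len(parts):
        return ""
    return parts[index - 1] if index > 0 else parts[index]


def abbr(value: str, table: dict) -> str:
    """Look up the postal code; no output when the name is not in the table."""
    return table.get(value, "")


def apply_chain(value: str, steps: list) -> str:
    """Apply each step to the result of the previous one, left to right."""
    return reduce(lambda current, step: step(current), steps, value)


# Mirrors [Place:split:,:-1:abbr:us_states]
steps = [
    lambda v: split_part(v, ",", -1),   # first extract the state name
    lambda v: abbr(v, US_STATES),       # then convert it to a postal code
]
print(apply_chain("Atlanta, Fulton, Georgia", steps))   # GA
```

Reversing the two steps would look up the whole field in the table first, find nothing, and leave nothing to split, which is why the order matters.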

Conclusions

ORA's Templates offer powerful tools to reduce the effort needed to transfer data from online genealogical services to your genealogy program. This article covers only the basics of employing these tools. For more information I suggest the following:

Copyright 2000- by Terry Reigel