RX Replace Node

Overview
Settings
Advanced Settings
Examples
Scripting

Overview

Regular expressions are special text strings that describe particular character patterns. The RX Replace node uses a regular expression to match those patterns within a string field and replace each match with different text. The replacement pattern can reference elements within the match pattern. The node creates a new field that contains the converted text.

The node uses the ICU Regular Expressions package. Full details can be found in the ICU regular expressions documentation.

Settings

Match field

This is used to select the string field containing the text that should be matched by the Pattern.

Prefix match field to field names

This specifies how the new field name should be generated:

  • when checked (the default setting), the new field name is generated by joining the name of the Match field to the Replace field name value
  • when unchecked, the new field name is the Replace field name value

Pattern

This defines the regular expression that will be matched against content of the Match field. Common regular expression components can be viewed and added by using the context menu in the Pattern text area.

Regular Expression Options…

These are described in Advanced Settings below.

Replace field name

This defines either the suffix which will be appended to the Match field or the full name of the new field, depending on the setting of Prefix match field to field names.

Replace pattern

This defines the replacement pattern that will be used to create the converted text in the output field. If the Pattern does not match anything within the Match field, the output field value will be the same as the Match field value. Common replacement components can be viewed and added by using the context menu in the Replace pattern text area.
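The way a replacement can reference parts of the match can be sketched with Python's `re` module, used here only as a stand-in for the ICU engine. The function name `swap_day_month` and the sample pattern are hypothetical. Python writes group references as `\1` and `\2`, whereas ICU replacement text writes them as `$1` and `$2`; whether the node passes the Replace pattern to ICU unchanged is an assumption.

```python
import re

# Hypothetical pattern with two capture groups: a two-digit day and month.
PATTERN = r"([0-9]{2})/([0-9]{2})"


def swap_day_month(text):
    # \2 and \1 refer to the second and first capture groups (Python syntax).
    # In ICU replacement text the same idea would be written as $2/$1.
    return re.sub(PATTERN, r"\2/\1", text)


print(swap_day_month("31/12"))    # groups are swapped
print(swap_day_month("no date"))  # no match, so the text is returned unchanged
```

The second call illustrates the no-match case described above: the output is the same as the input.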

Replace mode

This defines how many matches to perform on the Match field value:

  • Replace all: (the default setting) match and replace all occurrences of the Pattern
  • Replace first occurrence only: match and replace only the first occurrence of the Pattern
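The difference between the two modes can be illustrated with a small sketch that uses Python's `re` module as a stand-in for the ICU engine. The helper name `rx_replace` is hypothetical; the mode values `all` and `first` mirror the values of the `replace_mode` scripting property.

```python
import re


def rx_replace(text, pattern, replacement, mode="all"):
    """Replace matches of pattern in text, either all of them or only the first."""
    if mode not in ("all", "first"):
        raise ValueError("mode must be 'all' or 'first'")
    # In re.sub, count=0 replaces every match and count=1 stops after the first.
    count = 0 if mode == "all" else 1
    return re.sub(pattern, replacement, text, count=count)


print(rx_replace("a1b22c333", r"[0-9]+", "#", "all"))
print(rx_replace("a1b22c333", r"[0-9]+", "#", "first"))
```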

Advanced Settings

These settings control the general behaviour of the regular expression matcher. The default is for all settings to be unchecked. These can generally be left in their default state.

Case insensitive

When checked, regular expression matching will ignore character case.

Multiline

By default, ^ and $ match the start and end of the input text. When checked, ^ and $ will also match the start and end of each line within the input text.
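As a rough illustration, Python's `re.MULTILINE` flag behaves in a similar way and is used here as a stand-in for the ICU option. The sample text and the replacement value `X` are made up for the example.

```python
import re

text = "alpha\nbeta\ngamma"

# Without the flag, ^ matches only at the start of the whole input.
default_result = re.sub(r"^[a-z]+", "X", text)

# With the flag, ^ also matches at the start of each line.
multiline_result = re.sub(r"^[a-z]+", "X", text, flags=re.MULTILINE)

print(repr(default_result))
print(repr(multiline_result))
```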

Match ‘.’ as line terminator

When checked, a . in a pattern will also match a line terminator in the input text. By default it does not.
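The effect can be sketched with Python's `re.DOTALL` flag, which is assumed here to be a reasonable stand-in for the ICU option. The sample text is invented for the example.

```python
import re

text = "first\nsecond"

# Without the flag, "." cannot match the newline between "t" and "s".
default_result = re.sub(r"t.s", "_", text)

# With the flag, "." matches the newline, so "t\ns" is replaced.
dotall_result = re.sub(r"t.s", "_", text, flags=re.DOTALL)

print(repr(default_result))
print(repr(dotall_result))
```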

Comments in patterns

When checked, white space and #comments are allowed within regular expression patterns.
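A commented pattern might look like the sketch below, which uses Python's `re.VERBOSE` flag as a stand-in for the ICU option. The layout and the name `IPV4` are illustrative only. Because white space is ignored in this mode, a literal space in the pattern would need to be escaped.

```python
import re

# The anchored IPv4 pattern, spread over several lines with comments.
IPV4 = re.compile(r"""
    ^([0-9]{1,3}) \.   # first number and its trailing dot
    ([0-9]{1,3}) \.    # second number
    ([0-9]{1,3}) \.    # third number
    ([0-9]{1,3})$      # fourth number, at the end of the string
""", re.VERBOSE)

print(IPV4.sub("_._._._", "127.0.0.1"))
print(IPV4.sub("_._._._", "127.1"))
```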

Use Unicode word boundaries

This controls the behaviour of \b in a pattern. When checked, word boundaries are found according to the definition of word boundaries in Unicode UAX #29 (Unicode Text Segmentation).

Examples

All examples assume other settings are set to default.

Match and replace any non-numbers

This removes every character that is not a digit from the input string.

Pattern: [^0-9]

Replace pattern: (empty)

Input            Output
1234             1234
-1234            1234
1,234            1234
(555) 123 456    555123456
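The table above can be checked outside the node with a short sketch. It uses Python's `re` module as a stand-in for the ICU engine, and the helper name `digits_only` is hypothetical.

```python
import re


def digits_only(text):
    # Replace every character that is not 0-9 with the empty string.
    return re.sub(r"[^0-9]", "", text)


for value in ["1234", "-1234", "1,234", "(555) 123 456"]:
    print(value, "->", digits_only(value))
```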

Mask IPv4 addresses (basic)

This masks out numeric IPv4 (Internet Protocol) addresses and replaces the numbers with underscore (_). Numeric IPv4 addresses have the form n.n.n.n where n is an integer in the range 0-255. For simplicity, the example matches against the . delimiter without checking the number of characters matched.

Pattern: ([0-9]*)\.([0-9]*)\.([0-9]*)\.([0-9]*)

Replace pattern: _._._._

Input              Output           Notes
127.0.0.1          _._._._          Valid IPv4 (matches)
127.1              127.1            Invalid IPv4 (no match)
127.0.0.1.12345    _._._._.12345    Matches and replaces the first 4 sections
12345.0.0.1        _._._._          Matches even though 12345 is not a valid IPv4 value

Mask IPv4 addresses (better)

This masks out numeric IPv4 addresses and replaces the numbers with underscore (_). Unlike the previous example, this also matches the number of numeric characters (1-3 characters only).

Pattern: ([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})

Replace pattern: _._._._

Input              Output           Notes
127.0.0.1          _._._._          Valid IPv4 (matches)
127.1              127.1            Invalid IPv4 (no match)
127.0.0.1.12345    _._._._.12345    Matches and replaces the first 4 sections
12345.0.0.1        12_._._._        Matches since 345.0.0.1 does meet the match pattern

Mask IPv4 addresses (better still)

This masks out numeric IPv4 addresses and replaces the numbers with underscore (_). Like the previous example, this matches the number of numeric characters (1-3 characters only). However, it also requires that the first number is at the start of the string (specified with ^) and that the last number is at the end of the string (specified with $). This assumes that the input field contains only IPv4 addresses.

Pattern: ^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$

Replace pattern: _._._._

Input              Output             Notes
127.0.0.1          _._._._            Valid IPv4 (matches)
127.1              127.1              Invalid IPv4 (no match)
127.0.0.1.12345    127.0.0.1.12345    Invalid IPv4 (no match)
12345.0.0.1        12345.0.0.1        Invalid IPv4 (no match)
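The three IPv4 patterns can be compared side by side with the sketch below, which uses Python's `re` module as a stand-in for the ICU engine. The names `BASIC`, `BETTER`, `ANCHORED` and `mask` are hypothetical labels for the three examples. None of the patterns checks that each number lies in the range 0-255.

```python
import re

BASIC = r"([0-9]*)\.([0-9]*)\.([0-9]*)\.([0-9]*)"
BETTER = r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})"
ANCHORED = r"^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$"
MASK = "_._._._"


def mask(pattern, text):
    # Replace every match of the pattern with the fixed mask.
    return re.sub(pattern, MASK, text)


for text in ["127.0.0.1", "127.1", "127.0.0.1.12345", "12345.0.0.1"]:
    print(text, "|", mask(BASIC, text), "|", mask(BETTER, text), "|", mask(ANCHORED, text))
```

Only the anchored pattern leaves both malformed inputs untouched, which is why it suits fields that should contain nothing but an IPv4 address.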

Scripting

Settings

Node type name: regexp_replace

Setting                              Property                Type            Comment
Match field                          match_field             Field
Prefix match field to field names    prefix_match_field      Boolean
Pattern                              pattern                 String
Replace field name                   replace_field_name      String
Replace pattern                      replace_pattern         String
Replace mode                         replace_mode            all or first
Case insensitive                     opt_case_insensitive    Boolean
Multiline                            opt_multiline           Boolean
Match ‘.’ as line terminator         opt_dotall              Boolean
Comments in patterns                 opt_comments            Boolean
Use Unicode word boundaries          opt_uword_boundaries    Boolean

Scripting Example

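# Create an RX Replace node at canvas position (512, 192)
node = modeler.script.stream().createAt("regexp_replace", u"RX Replace", 512, 192)
node.setPropertyValue("match_field", u"IPv4")
# Use the Replace field name as the full name of the new field
node.setPropertyValue("prefix_match_field", False)
# Backslashes are doubled so that the pattern contains a literal \. for each dot
node.setPropertyValue("pattern", u"([0-9]*)\\.([0-9]*)\\.([0-9]*)\\.([0-9]*)")
node.setPropertyValue("replace_field_name", u"MaskedIP")
node.setPropertyValue("replace_pattern", u"_._._._")
# Replace only the first match in each value
node.setPropertyValue("replace_mode", u"first")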
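The advanced options could be set in the same way. The sketch below is one possible approach: it applies a dictionary of the property names from the Settings table through `setPropertyValue`. The helper `configure_node` and the `FakeNode` class are hypothetical; `FakeNode` exists only so the sketch can run outside Modeler, where the `modeler` object is not available.

```python
class FakeNode(object):
    """Hypothetical stand-in that records property values; not part of Modeler."""

    def __init__(self):
        self.properties = {}

    def setPropertyValue(self, name, value):
        self.properties[name] = value


def configure_node(node, settings):
    # Apply each property in a stable (sorted) order.
    for name in sorted(settings):
        node.setPropertyValue(name, settings[name])
    return node


# Property names are taken from the Settings table; the values are examples.
ADVANCED = {
    "opt_case_insensitive": True,
    "opt_multiline": True,
    "opt_dotall": False,
    "opt_comments": False,
    "opt_uword_boundaries": False,
}

node = configure_node(FakeNode(), ADVANCED)
print(node.properties["opt_multiline"])
```

Inside a Modeler script, the node returned by `createAt` in the example above would be passed in place of `FakeNode()`.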