Name: Regular Expressions for IBM SPSS Modeler
SKU: 12688
Price: 199.00 GBP
Availability: InStock

Regular expressions are strings used to describe particular character patterns. These expressions can be used to match and group text fragments, search for patterns and replace them, or split text into multiple pieces. Example uses include:

Extracting components from log files such as time, severity and descriptive text
Converting different phone number styles into a standard format
Splitting URLs (web page addresses) into the individual components that make up each address
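
The three example uses above can be sketched in plain Python with the standard `re` module. This is an illustration only, not the product's nodes: the function names, the log line layout (timestamp, severity, text) and the ten-digit phone format are assumptions made for the example.

```python
import re

# Assumed log layout: "YYYY-MM-DD HH:MM:SS SEVERITY free text"
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<severity>[A-Z]+) "
    r"(?P<text>.*)$"
)

# Simplified URL pattern: scheme://host/path?query#fragment
URL = re.compile(
    r"^(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*)://"
    r"(?P<host>[^/?#]+)"
    r"(?P<path>[^?#]*)"
    r"(?:\?(?P<query>[^#]*))?"
    r"(?:#(?P<fragment>.*))?$"
)


def parse_log_line(line):
    """Return time, severity and text as a dict, or None if no match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def normalize_phone(raw):
    """Reduce a ten-digit number in any punctuation style to NNN-NNN-NNNN."""
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) != 10:
        return None  # outside the assumed format
    return "{}-{}-{}".format(digits[:3], digits[3:6], digits[6:])


def split_url(url):
    """Return the URL components as a dict, or None if no match."""
    match = URL.match(url)
    return match.groupdict() if match else None
```

For instance, `normalize_phone("(555) 123-4567")` and `normalize_phone("555.123.4567")` both reduce to the same standard form, while input that does not fit the assumed format yields `None`.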

Regular Expressions for IBM SPSS Modeler is a set of new nodes that adds the power of regular expressions to IBM SPSS Modeler. The new nodes are:

RX Groups: this node matches specific items in a string and extracts them into new output fields
RX Split: this node splits a string into separate components at a specified delimiter and adds the components to new output fields
RX Replace: this node matches patterns within a string field and replaces them with a different pattern; the result is added to a new output field
String Cleaner: this node provides common string cleaning operations (e.g. removing duplicate whitespace or non-printing characters) across multiple input fields in a simple-to-use way
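
As a rough guide to what each operation means, the four behaviours described above can be approximated with Python's standard `re` module. This is a sketch under stated assumptions, not the nodes' implementation: the function names are hypothetical, and the cleaning rules shown (strip ASCII control characters, collapse whitespace runs) are one plausible reading of "common string cleaning operations".

```python
import re

# ASCII control characters other than tab, newline and carriage return
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def rx_groups(pattern, value):
    """Return the captured groups of the first match as a list."""
    match = re.search(pattern, value)
    return list(match.groups()) if match else []


def rx_split(delimiter_pattern, value):
    """Split a string wherever the delimiter pattern matches."""
    return re.split(delimiter_pattern, value)


def rx_replace(pattern, replacement, value):
    """Replace every match of the pattern; the input is left unchanged."""
    return re.sub(pattern, replacement, value)


def clean_string(value):
    """Remove control characters, collapse whitespace runs and trim."""
    without_controls = CONTROL_CHARS.sub("", value)
    return re.sub(r"\s+", " ", without_controls).strip()
```

Each function returns a new value rather than modifying its input, mirroring the way the nodes write their results to new output fields.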

Scalable

Regular Expressions for IBM SPSS Modeler scales to millions of data rows. Unlike approaches that rely on temporary files, the new nodes process records "in-line", i.e. one at a time in memory. This massively increases processing speed while keeping memory requirements low and removing the need for additional temporary disk space.
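
The record-at-a-time idea can be illustrated with a Python generator. This is a minimal sketch of the general technique, not a description of how the nodes work internally; the function name and the sample records are hypothetical.

```python
import re


def extract_inline(records, pattern):
    """Yield the captured groups for each record, one record at a time.

    Only the current record is held in memory, and nothing is written
    to disk. Records that do not match yield None.
    """
    compiled = re.compile(pattern)  # compile once, reuse for every record
    for record in records:
        match = compiled.search(record)
        yield match.groups() if match else None


# Usage: the source can be any iterable, such as an open file handle.
rows = iter(["id=1 ok", "bad", "id=22 fail"])
results = list(extract_inline(rows, r"id=(\d+) (\w+)"))
```

Because the generator consumes its input lazily, memory use stays roughly constant regardless of how many rows pass through it.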

Scriptable

The nodes are fully configurable using the Python scripting language provided by IBM SPSS Modeler. Online documentation for each node includes its scripting reference.

System Requirements

IBM SPSS Modeler 17.x or 18.x
Windows 7, 8 or 10 64-bit operating system

Trademarks

IBM and SPSS are registered trademarks of International Business Machines Corp.