Text Scrape Processor
The Text Scrape Processor converts semi-formatted text into an XML stream based on a set of content “markers”.
This processor lets you take text content, such as an e-mail or a terminal screen scrape, and assign start and end positions for tags. In effect, it parses fixed-location content into XML. To get started in the editor, load a formatted e-mail or similar text and click substrings to set the start and end tags.
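To make the idea of parsing fixed-location content into XML concrete, here is a minimal Python sketch. It is a conceptual illustration only, not the processor's implementation: the sample line, the field names, the column offsets and the `scrape_fixed` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-width screen line, built with ljust() so the
# column offsets below are easy to verify.
SCREEN_LINE = "ACCT " + "00123".ljust(7) + "JANE DOE".ljust(16) + "BAL " + "0042.50"

# Hypothetical field layout: (tag name, start offset, end offset).
# The end offset is exclusive, as in Python slicing.
FIELDS = [
    ("Account", 5, 12),
    ("Name", 12, 28),
    ("Balance", 32, 39),
]

def scrape_fixed(line, fields, root_tag="Record"):
    """Slice each field out of the line and wrap it in an XML element."""
    root = ET.Element(root_tag)
    for tag, start, end in fields:
        child = ET.SubElement(root, tag)
        # Strip the padding that fills out the fixed-width column.
        child.text = line[start:end].strip()
    return root

record = scrape_fixed(SCREEN_LINE, FIELDS)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each field becomes one child element of `Record`, so a change in the screen layout only requires updating the offset table.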
Processor (Adapter) Configuration Drop-Down List
Select Text Scrape from the drop-down list and click Add Processor.
Click on Add Processor
Basic Text Scrape Processor Configuration Options
The Text Scraping Configuration window opens. Let’s load some sample data. In this example, we use a simple text file. Click File – Open Sample.
Select Open Sample from Text Scraper File Window
This example uses the file sample_mail.txt. If that file is not included in the data or tutorial files you downloaded from the PilotFish site, you can use any text file instead.
Click on Open Sample Button in Text Scraper Window
The text from the sample file now appears in the Text Scraping Configuration window.
View the Sample Text in the Text Scraping Window
Let’s create start and end markers. Select Hello in the text and click Mark – Create Marker Start. Then select Regards in the text and click Mark – Create Marker End.
Create the Start and End Markers in the Text Scraping Window
We have now created a simple Text Scraping Configuration. The default output tag is Hello. To change it, enter another tag name in the Tag column.
Review Output Tags in Text Scraping Example
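The effect of a start marker and an end marker can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not the processor's actual logic: the sample text is invented (it is not the contents of sample_mail.txt), `scrape_between` is a hypothetical helper, and the sketch assumes the marker text itself is excluded from the output, which the real processor may handle differently.

```python
import xml.etree.ElementTree as ET

# Invented sample text; the real sample_mail.txt may look different.
SAMPLE = """Hello Team,
The nightly build finished without errors.
Regards,
Build Bot
"""

def scrape_between(text, start_marker, end_marker, tag):
    """Return an XML element holding the text between two markers.

    Assumption: the markers themselves are not part of the content.
    Returns None when either marker is missing.
    """
    start = text.find(start_marker)
    if start == -1:
        return None
    content_start = start + len(start_marker)
    # Search for the end marker only after the start marker.
    end = text.find(end_marker, content_start)
    if end == -1:
        return None
    element = ET.Element(tag)
    element.text = text[content_start:end].strip()
    return element

# The tag name defaults to the start marker text in the walkthrough above.
element = scrape_between(SAMPLE, "Hello", "Regards", tag="Hello")
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Passing a different `tag` value corresponds to typing a new name into the Tag column.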
Let’s see what the output XML looks like. Click the XML Output tab.
Select the XML Output Tab to View the XML Version of the Text
We can also look at the created configuration file. Click the XML Config tab.
Then save the configuration (File – Save Config) and close the Text Scraping Configuration window.
Select the XML Config Tab to View the Created Configuration
Conditional Execution Text Scrape Processor Configuration Options