Using iMacros in Skeptical Activism

I wanted to grab particular details from the Australian Traditional-Medicine Society’s practitioner search results, and then use those details (website URLs) in a Custom Google Search.

This particular web-based database allowed me to list all their practitioners without entering a refining search term or location. To scrape the information, I used a Firefox add-on called “iMacros” from iOpus, using its “Relative Positioning” feature. The Help section is a bit messed up, but if you find your way to their Wiki, you’ll be fine.

You can get iMacros from iOpus at: iMacros for Firefox 7.2.2.0 by iOpus

Automate Firefox. Record and replay repetitious work. If you love the Firefox web browser, but are tired of repetitive tasks like visiting the same sites every day, filling out forms, and remembering passwords, then iMacros for Firefox is the solution you’ve been dreaming of! Whatever you do with Firefox, iMacros can automate it.

The Hard Working Script

The first line of code (after “settings”)

TAG POS=1 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

The script searches for a table cell (TYPE=TD) containing the text “Website” that also has a particular set of attributes.

Each time such a cell occurs on a webpage, it is assigned a different position number.
The first instance is given the “TAG” of position 1 (POS=1). (REMEMBER THIS)

The attributes defined here are NOWRAP, an ALIGN to the right, and a CLASS of “searchResultsLabels”.
How attribute definitions work may become clearer in the next line of code.

The second line

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

The script then seeks out and extracts the data located at the first relative position (POS=R1) after the previous TAG (the one at POS=1).

The relative positions are searched for and assigned by the script based on the following information:

The TYPE defines the type of element I wanted the script to look for, in this case an <a> tag.
The ATTR specifies that the tag can have anything (*) as the HREF and anything (*) as the TXT.

Basically, the script looks for the first link after POS=1 (which was defined as “Website”).

The final part “EXTRACT=TXT” scrapes the TXT portion of the link (I used this because it was exactly the same as the HREF) and stores it in memory.
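The anchor-then-relative lookup described above can be sketched in Python. This is only a rough simulation of the idea, not how iMacros is implemented: the page is reduced to a hypothetical list of (tag, text) pairs in document order, and the names PAGE, find_anchor and extract_relative are made up for illustration.

```python
# Hypothetical, simplified page: elements in document order as (tag, text) pairs.
PAGE = [
    ("TD", "Name"), ("TD", "Jane Example"),
    ("TD", "Website"), ("A", "www.example.com"),
    ("TD", "Name"), ("TD", "John Example"),
    ("TD", "Website"), ("A", "www.example.org"),
]


def find_anchor(page, tag, text, pos):
    """Return the index of the pos-th element matching tag and text (like POS=n)."""
    seen = 0
    for index, (el_tag, el_text) in enumerate(page):
        if el_tag == tag and el_text == text:
            seen += 1
            if seen == pos:
                return index
    return None  # no such instance on this page


def extract_relative(page, anchor_index, tag, rel):
    """Return the text of the rel-th element of type tag after the anchor (like POS=Rn)."""
    if anchor_index is None:
        return None
    seen = 0
    for el_tag, el_text in page[anchor_index + 1:]:
        if el_tag == tag:
            seen += 1
            if seen == rel:
                return el_text
    return None


# Second "Website" cell, then the first link after it.
anchor = find_anchor(PAGE, "TD", "Website", 2)
print(extract_relative(PAGE, anchor, "A", 1))
```

The point of the sketch is that the relative lookup always starts counting from wherever the preceding anchor was found, so changing the anchor position changes which link is extracted.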

Third Line

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

The SAVEAS command takes what is in the memory (EXTRACT) and writes an output file.

This particular line means that each time a link was found it would write it to the output file.
The output file would be ADDED to, NOT replaced.

The example above is the WINDOWS version of the SAVEAS command.

FOLDER defines the folder location. Currently, it saves to the Desktop of the user named “SkepticTools”.
Spaces in the path must be written using the <SP> tag (for example, Documents<SP>and<SP>Settings).

FILE defines the filename. Using a (*) wildcard uses the default output file “results” (file type not defined).
In this instance, “ATMSsearch” is the filename (without the .txt extension included).

The MAC Version would use: FOLDER=/Users/SkepticTools
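Because iMacros uses the <SP> token for a space inside command parameters, it can be handy to build the SAVEAS line from an ordinary path rather than typing the tokens by hand. The following is a small Python sketch of that idea; the helper name saveas_line is hypothetical, and the paths are just the ones used as examples in this post.

```python
def saveas_line(folder: str, filename: str = "*") -> str:
    """Build an iMacros SAVEAS line, writing spaces as the <SP> token."""
    folder_token = folder.replace(" ", "<SP>")
    file_token = filename.replace(" ", "<SP>")
    return f"SAVEAS TYPE=EXTRACT FOLDER={folder_token} FILE={file_token}"


# Windows-style path containing spaces, with an explicit filename.
print(saveas_line(r"C:\Documents and Settings\SkepticTools\Desktop", "ATMSsearch"))

# Mac-style path, falling back to the (*) wildcard for the filename.
print(saveas_line("/Users/SkepticTools"))
```

The generated line can then be pasted into the macro in place of a hand-written SAVEAS command.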

Making it Loop

The previous lines only perform the search and write function ONCE, but I needed the script to look for “Website” up to ten times on each page.

Rather than play it as a loop, I opted to have the script seek out 10 instances of “Website” and extract what it could.

On the first page, there were only 3 instances of “Website”, but as an example I have the script run 4 searches.

To do this, I had to copy and paste lines 1-3 and modify line 1 slightly in each instance.
You will notice the Position is increased each time.
You may remember earlier I explained that EACH INSTANCE is given a different position number.

So, the script looks for: instance 1, instance 2, instance 3, instance 4, etc.

TAG POS=1 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

TAG POS=2 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

TAG POS=3 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

TAG POS=4 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

This, I hope, makes the relative positioning clearer. The relative position is in RELATION to the position defined immediately preceding it.
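Copying and pasting the three lines ten or more times gets tedious, so the repeated blocks can also be generated. Here is a minimal Python sketch that prints the same three-line pattern with the anchor position increasing each time. The function name build_macro_body is hypothetical, and the SAVEAS line is passed in unchanged, so use whichever folder applies to you.

```python
# The two TAG lines are copied from the macro above; only POS changes per block.
ANCHOR = "TAG POS={n} TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website"
EXTRACT = "TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT"


def build_macro_body(count: int, saveas: str) -> str:
    """Return `count` anchor/extract/save blocks with POS running from 1 to count."""
    lines = []
    for n in range(1, count + 1):
        lines.append(ANCHOR.format(n=n))  # look for the n-th "Website" cell
        lines.append(EXTRACT)             # first link after that cell
        lines.append(saveas)              # append the extracted text to the file
    return "\n".join(lines)


print(build_macro_body(4, "SAVEAS TYPE=EXTRACT FOLDER=/Users/SkepticTools FILE=*"))
```

Paste the printed lines between your settings and the “next page” TAG line.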

Going to the Next Page

Once you have the script repeating the search as many times as you need on each page (10, 20, 25, 50), you still need to go to the NEXT page. For my macro, I used:

TAG POS=1 TYPE=IMG ATTR=BORDER:0&&SRC:/images/record_next.gif&&TXT:

The script searches for and TAGs the first instance of “/images/record_next.gif”.
Because in my example the image was also a link, the script will automatically follow the link.

This should be the end of your script (for one page).
To make it loop, run the script in LOOP Mode.

Settings

My settings were as follows:

VERSION BUILD=7220523 RECORDER=FX

TAB T=1

The above is the standard information for the macro.

SET !ERRORIGNORE YES

SET !TIMEOUT_TAG 1

SET !TIMEOUT_STEP 1

The above are settings to ignore errors and reduce the time-outs to 1 second (default is 5),

SET !EXTRACT_TEST_POPUP NO

is used to turn off the popups that occur during testing.
I highly recommend removing this line until you are happy that the script works correctly.

TAG POS=1 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

TAG POS=2 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

TAG POS=3 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

TAG POS=4 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=C:\Documents<SP>and<SP>Settings\SkepticTools\Desktop FILE=ATMSsearch

TAG POS=1 TYPE=IMG ATTR=BORDER:0&&SRC:/images/record_next.gif&&TXT:

The iMacros Code for MAC

VERSION BUILD=7220523 RECORDER=FX

TAB T=1

SET !ERRORIGNORE YES

SET !TIMEOUT_TAG 1

SET !EXTRACT_TEST_POPUP NO

TAG POS=1 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=2 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=3 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=4 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=5 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=6 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=7 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=8 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=9 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=10 TYPE=TD ATTR=NOWRAP:&&ALIGN:right&&CLASS:searchResultsLabels&&TXT:Website

TAG POS=R1 TYPE=A ATTR=HREF:*&&TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=/Users/bayani FILE=*

TAG POS=1 TYPE=IMG ATTR=BORDER:0&&SRC:/images/record_next.gif&&TXT:
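Since the macro asks for ten websites on every page whether or not they exist, the output file will likely need a tidy-up before the URLs go into a Custom Google Search. Below is a rough Python sketch of that clean-up. It rests on assumptions you should check against your own output file: one extracted value per line, possibly wrapped in double quotes, and failed extractions appearing as the placeholder #EANF#. The function name clean_extracted is hypothetical.

```python
def clean_extracted(lines):
    """Drop blanks, failed-extraction placeholders and duplicates, keeping first-seen order."""
    seen = set()
    urls = []
    for raw in lines:
        # Assumption: values may be wrapped in double quotes.
        value = raw.strip().strip('"').strip()
        # Assumption: a failed extraction is recorded as #EANF#.
        if not value or value == "#EANF#":
            continue
        key = value.lower()  # treat differently-cased copies as the same site
        if key not in seen:
            seen.add(key)
            urls.append(value)
    return urls


sample = ['"www.example.com"', '"#EANF#"', "", "www.example.com", '"www.example.org"\n']
print(clean_extracted(sample))
```

To run it on a real file, pass in the lines of the saved output, for example the result of open(path).read().splitlines() with path pointing at your own file.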

If you’ve got any problems, I may be able to assist. :)
