Fast Data CleanUp Techniques

Using the FILTEX and SWAP functions

If you have a large database to "clean up" you could use some handy functions.

Microsoft FoxPro is a fast and effective tool for data cleanup.  You can work from the Command Window using the BROWSE command, to keep an eye on the data as you make the changes.

The most common operation is to make Global Replace changes, based on a condition.    This means you let FoxPro quickly change many records that you specify.   As the database gets cleaner and cleaner, you narrow down to a handful of changes that you may wish to make by hand.

SET FILTER TO ____ - to confine the changes to a subset of the data file

REPLACE ALL __ with ___ to make the changes. 

FILTEX Function

You've heard of the FILTER command and the INDEX command.  This is a combination of the two.  Even Fas FoxPro can bog down to a crawl with a filter set to a narrow condition on a file with many records.  This function uses the Filter clause of the INDEX ON command, to result in a very speedy BROWSE of the records you want to Swap.   Click here to see the program written in XBase for FoxPro.

This is because with BROWSE, FoxPro has to evaluate the condition for each record, every time you move or resize the window, which slows it down.  When an index with a filter is created, FoxPro only has to evaluate the condition once.  It knows the record numbers it should be working with, which makes it fast.  As records are cleaned and removed from the filter, they automatically are removed from view.

SWAP Function

Remember that if you REPLACE A WITH B, B WITH A will result in a duplicate of B in both fields.  So you should use the SWAP function.  Click here to see the program written in XBase for FoxPro.

The SWAP function first puts both values in memory variables, then does the REPLACE of fields with the memory variables, so you don't end up with two of one value.  They swap properly.

SWAP is much safer than a REPLACE, because if you swapped the wrong records, you can always just swap them back.  No values get lost, just switched in their fields.   SWAP "sweeps" through your data table cleaning as it goes.


Here is an illustration of the use of the FILTEX and SWAP functions in use.

wpe1C.jpg (193330 bytes)

Example - suggested series of steps

=filtex(['FAX' $ Home_Phone]) Set a fast filter to those Home_Phone records that contain 'FAX'
=swap('HOme_Phone', 'fax', .t.) Swap the fields, using memory variables between Home_Phone and Fax. 
The .t. optional parameter means to apply it to all records in the current Filter
=filtex() Passing no condition to the Filtex function turns off the filter.
This allows you to look at all the records and see if everything is ok
=filtex(['BUS' $ home_phone]) Next go for those Home_Phone numbers with 'BUS' for Business phone
=swap('Home_Phone', 'Phone') This wills swap the home and business phones
=filtex(['RES' $ Phone]) This will focus the filter on those business phones with 'RES' in it
=swap('Home_Phone', 'Phone') Same as the previous swap, basically
=filtex(['CAR' $ MobilPhone]) This was just to see how many MobilPhones had 'CAR' in it
which would be the proper field
=filtex(['CAR' $ Home_Phone]) But 'CAR' should not be in the Home_Phone field
=Swap('Home_Phone', 'MobilPhone') So we swap these CAR's in to the MobilPhone field