Skip to content

Using the SSIS Scripting Task to Geocode Addresses

2012/03/27

In February I was lucky enough to see Brian Larson give a talk on GeoSpatial visualization using SSRS at the PASSMN meeting. He showed us how to use the Map Wizard in SSRS to quickly create some cool reports. As he was looking up the latitude and longitude for the addresses I remembered a project I had worked on using SSIS and Google Maps to geocode data. For the March PASSMN Meeting I agreed to share some information on the Scripting Task in SSIS and I decided to use the geocode example. Since I was giving this presentation in a Microsoft building I changed from Google to Bing.  Leave out having to learn the Bing API in a couple of days the package for doing this is pretty straight forward. A SQL Task to get the address information is passed to a For Each Loop that runs the script for each address. The Loop updates the Lat and Long and then a final SQL task to create the geospatial points in the database.

For the address data I have it stored in a table in the SQLMD database called Address. To pass the data to the Bing API I only need a single string that includes the address, but after reading six chapters in SQL Server MVP Deep Dives 2 on Database Architecture I couldn’t bring myself to create a single column table.

The first Task in the SSIS package reads in the Addresses as a single string where the latitude or longitude are null. (No sense geocoding if we already have the data.)

SQL Statement:
SELECT isnull([address1],'') + isnull(' ' + [address2] + ', ','') + 
isnull([city] + ', ','') + isnull([state] + ', ','') + isnull([zip] + ', ','') 
as Address,AddressID FROM [SQLMD].[dbo].[address] where lat is null or long is null

I’ve set up the following variables in the the package.

The result set for this first Execute SQL Task is sent to the Addresses object variable. This gives us a record set to pass to the For Each Loop. The For Each Loop reads through each record in the Addresses recordset and sends the values of Address and AddressID to the SSIS variables of the same name. 

The script runs for each record and the Address is passed in as a read only variable. Lat and Long are passed in as read/write variables. The Sub Main can be replaced with the script here (vb|c#), but first the web reference for the Bing GeoCodeService needs to be added. The web reference is added by right clicking the project in the solution explorer and selecting Add Web Reference (or you can add by going to the Project properties tab.)

The GeoCodeService is found at http://dev.virtualearth.net/webservices/v1/metadata/geocodeservice/geocodeservice.wsdl.  Enter this URL and the service should come back as net.virtualearth.dev. I changed the name to bing.geocode before I added the reference just to make my code more readable. The documentation on how to use it can be found MSDN Library. Giving a tutorial on the GeoCode API is beyond what I want to do here. I am hoping to show a few examples of using the scripting task for tasks that the out of the box tasks can’t do. This is the first post as a part of that effort. To use the Bing API you will need a key to replace the “[Get Key from Bing API page]” string in the sample code. The Bing API code can be obtained by going to the Getting a Bing Maps Key page.

Once you have the web service added you can call all of the needed classes and methods to geocode an address. For my script I set it up to check for the SSIS variables using the DTS.Variables collection

 If Dts.Variables.Contains("Address")...

Before I started adding this condition to my scripts I would forget to add the variables about half the time. By adding this logic into the script I now almost always remember to set up the variables. This makes it a little redundant, but it is worth the time not to get the error pop up when I hit run.

I wrap the call to the GeoCodeservice in a Try…Catch so that if I don’t get back a response from the web service for any reason the package will not error out and it will keep looping through the addresses.

Lastly in the script I made use of the FireInformation and FireError DTS Evenets to help with debugging.

After the script runs The Lat and Long are passed out to SSIS variables of the same name. These are used as parameters in the Execute SQL task to update the address table.

update address
set lat= cast(? as float), long = cast(? as float) 
where addressid = ? 

Once all of the records have looped through the script there is one last Execute SQL task that updates geocodedata.

UPDATE address 
SET [Geocodedata] = geography::Point([Lat], [Long], 4326)

After running the package I used the SSRS Map wizard to pull in the geospatial data and create a map of my points.

There are other ways to geocode data in a SQL database and a batch based solution would probably be more practical, but this works well for small records and it was fun to put together.

Advertisements

From → SQL Server

2 Comments
  1. Can you please provide the SSIS package?

    • SQLMD permalink

      I haven’t used this in quite some time. I would have to rebuild it from scratch. It would probably be easier for you to just do that on your end than to try and worry about versions and getting the package on the site. Sorry.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s