dinsdag 27 juli 2010

Geocoding text based spatial data wih SSIS packages.

On SQL Server Performance site I found an interesting example of geocoding some spatial data of siddharth Mehta. In his example he decodes some famous places around the world into a latitude and a longitude. I decided to try this and to learn more about geocoding. Below you can see the result.

These are the steps you have to take for this small demo:
1. Create testdata.
2. Use geocode processing engine (Bing or Google).
3. Built a geocode webservice wrapper class.
4. Built a SSIS package
5. Write the results in an table.

1. Create the testdata

In the example of siddharth Mehta he used some objects from around the world. I decided to add some places and tried some tests with it. Below you can see the starting point i used.


2. Use a geocode processing engine

We need to use bing maps or google maps to process the information, as we provide in the example, which can take text based location information as an input and return the latitude and longitude of the corresponding location. These web based applications expose web-services which can be consumed by applications, our purpose in this case is to geocode the text based data we have in our "Address" table.
For the purposes of this sample application, we will use the webservice exposed by Bings Maps. First we will need to subscribe to the service. Navigate to www.bingmapsportal.com, sign up and create a key which will be your gateway to the webservices exposed by Bing Maps. In the below screenshot, you can see that I have created my application-key pair.


3. Built a geocode webservice wrapper class

As the article states, create a client or wrapperclass which can be used in the SSIS package. This way it’s easy to use a webservice. The .NET framework ships with the wsdl.exe tool. This tool creates a proxy class. Installing the .NET framework was not enough in my case.  Even when i installed the SDK for .NET framework icouldn't find the wsdl tool. I had to install visual studio express before the wsdl tool was installed. May be I missed something.

Do this:
wsdl /out:C:/geocode.cs http://dev.virtualearth.net/webservices/v1/geocodeservice/geocodeservice.svc?wsdl

This will create a geocoding class in C#. This will be used in the SSIS Spackage.


4. Built a SSIS package

The next step is building a SSIS package.There are four steps in this package:
a. Read the textual spatial information
b. Add two columns: Longitude and latitude
c. Retrieve the information with the proxy class and bing webservice
d. Create a destination.

a. Read the textual spatial information.
  • Create a SSIS package and give it a name.
  • Create an OLEDB connection to the Address table.
  • Add a Dataflow to the package and configure it to read the address table 
b. Add two columns: Longitude and latitude
  • Create a derived column task.
  • Add two fields latitude and longitude
  • Use the unicode textstring and set the length to 100



c. Retrieve the information with the proxy class and bing webservice.


  • add a script transformation task
  • Select Type transformation.
  • Select scripting language visual C#.
  • Edit the lattitude and logitude columns and make them writeable.
  • In the project explorer choose "add existing item".
  • Add references System.Web.Services and System.XML to the project.
  • Select the proxy class Geocode.cs which you created with WDSL tool.
  • Add the following code snippit in the Script task (Input0_ProcessInputRow)
{
string latitude = "", longitude = "";
GeocodeAddress(Row.AddressObject + ", " + Row.AddressCountry + ", " + Row.AddressCity, ref latitude, ref longitude);

Row.Latitude = latitude;
Row.Longitude = longitude;
}

private void GeocodeAddress(string address, ref string latitude, ref string longitude)
{

string key = @"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";

GeocodeRequest geocodeRequest = new GeocodeRequest();

// Set the credentials using a valid Bing Maps key

geocodeRequest.Credentials = new Credentials();
geocodeRequest.Credentials.ApplicationId = key;

// Set the full address query
geocodeRequest.Query = address;

// Set the options to only return high confidence results
ConfidenceFilter[] filters = new ConfidenceFilter[1];
filters[0] = new ConfidenceFilter();
filters[0].MinimumConfidence = Confidence.High;
// Add the filters to the options
GeocodeOptions geocodeOptions = new GeocodeOptions();
geocodeOptions.Filters = filters;
geocodeRequest.Options = geocodeOptions;

// Make the geocode request
GeocodeService geocodeService = new GeocodeService();
GeocodeResponse geocodeResponse = geocodeService.Geocode(geocodeRequest);

if (geocodeResponse.Results.Length > 0)
{

latitude = geocodeResponse.Results[0].Locations[0].Latitude.ToString();
longitude = geocodeResponse.Results[0].Locations[0].Longitude.ToString();

}
}

  • Replace xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx with the key which gave Bing in the BingMapsPortal.
  • Not sure if this is needed, but do a build.
d. Create a destination
  • Write the results in a table with Destination


5. Write the results in a table.


And here are the results:

Conclusion:

So this example shows the transformation of turning the textual spatial information into a spatial information. I tried some some small places and they all got a latitude and longitude. This way you can transform spatial information in geocoded information. Interesting and i will try more of this stuff and blog abou it.


Hennie

3 opmerkingen:

  1. Hi Hennie,
    thank you very much for your post, it's been very useful and the key to succeed.

    best regards and my best wishes for the new year 2011

    xavier.garcia.sabater@gmail.com
    free lance bi consultant

    BeantwoordenVerwijderen
  2. Hi Xavier,

    Thanx for your reply.

    Tried to look at your blog but no luck.

    Hennie

    BeantwoordenVerwijderen
  3. It's possible to run the SSIS package when the input is geocode information? i mean the latitude and longitude values are the input values and the output are the addresses?

    BeantwoordenVerwijderen