Installing OGDI version 5 on Azure – Part 6

Posted on May 5, 2013 in OGDI DataLab

Uploading Data in the Catalogue

In this part, we will upload data to the catalogue we created and configured in the previous parts of this walkthrough. The OGDI DataLab v.5 solution has two projects that can help with that: a console application and a Windows application. We will focus on the more visual and easier to use of the two, DataLoaderGuiApp.

To complete this step, you will need a CSV or KML file that contains data for your catalogue. If you do not have one and are just experimenting with OGDI, search the web for the term “open data”, optionally with a city name. Most large municipalities already publish their data as plain files. One example is the Region of Peel in Ontario, Canada, whose data was available online as of 26 June 2012. In our example we will use a CSV file with parking facilities in New York. We suggest starting with a small file: you can inspect it visually to make sure it is valid, and it will take less time to upload. You can always add data later using the same procedure described below.
  1. Start the DataLoaderGuiApp project. The OGDI Data Loader Windows application opens.

  1. First, we need to configure the connection settings. Click the Settings tab and then the [Connection] button.
  2. In the Endpoint Setting form, enter the name of your data storage account in the Account name field and its secondary key in the Account key field. If you do not remember them, log in to your Windows Azure Platform management panel and look for the storage account whose name ends with the word data. You can also check Part 1 of our walkthrough for details.

  1. Click [Save].
  2. Go back to the File tab and choose Open.
  3. Choose a CSV or KML file to upload. If it is a CSV file, make sure it really contains comma-separated values and that every row has valid information in each column.
  4. Click [Open] to load the file into the application. It is not uploaded to the storage yet.
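
If you would like to check a CSV file before opening it in the loader, a short script can do the visual inspection for you. The following is only a minimal sketch in Python and is not part of OGDI. It assumes the first line of the file is a header row, and the function name `find_bad_rows` is our own. It reports rows that have the wrong number of columns or an empty value.

```python
import csv
import io


def find_bad_rows(csv_text):
    """Return (line_number, reason) pairs for rows that look invalid.

    Assumption: the first line is a header that defines the column count.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return [(1, "file is empty")]

    problems = []
    for row in reader:
        line = reader.line_num  # line number of the row just read
        if len(row) != len(header):
            problems.append(
                (line, f"expected {len(header)} columns, found {len(row)}")
            )
        elif any(cell.strip() == "" for cell in row):
            problems.append((line, "empty value"))
    return problems
```

To use it, read your file into a string, for example with `open(path, newline="", encoding="utf-8").read()`, where `path` points to your own file, and pass the text to `find_bad_rows`. An empty list means no problems of these two kinds were found. It does not prove the loader will accept every record.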

Now, we need to configure the catalogue information in the OGDI Metadata Designer.

  1. The Dataset Metadata tab mainly holds the information that describes the catalogue. This is what users will see, so it tells them what data you offer, how often you update it, and so on. Try to add descriptive information so you can attract more people to look at your catalogue.

  1. The next tab, Dataset Properties, is more critical for the proper functioning of your catalogue. On it you define the unique fields the catalogue uses to distinguish each record. Windows Azure storage uses two fields for that: Dataset Primary Key and Dataset RowKey. You can use New.Guid, which means random, unique values are generated for these two fields, or you can choose fields from your CSV file that you know are unique and will stay unique in the future. This is entirely your call, but we recommend you use the New.Guid feature and forget about saving a few bytes of space per record.
  2. The Data Source Timezone identifies the time zone of the location where this data is published.
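
If you are considering one of your own columns as a key, it helps to confirm first that its values really are unique. The sketch below is a hypothetical Python helper, not something the OGDI loader provides. It assumes the CSV has a header row. `is_unique_column` checks a column for duplicate or empty values, and `row_key` shows the fallback idea behind New.Guid using Python's own `uuid` module.

```python
import csv
import io
import uuid


def is_unique_column(csv_text, column):
    """Return True if every row has a distinct, non-empty value in `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if column not in (reader.fieldnames or []):
        return False  # the column does not exist in the header row

    seen = set()
    for row in reader:
        value = (row.get(column) or "").strip()
        if value == "" or value in seen:
            return False
        seen.add(value)
    return True


def row_key(row, column=None):
    """Use the chosen column as the key, or a random GUID when none is given."""
    return row[column] if column else str(uuid.uuid4())
```

A column that passes this check today can still receive duplicates in later uploads, which is one more reason to prefer generated keys.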

  1. Go to the Dataset Columns tab, where you can define the data type of each column as well as the two columns that hold geographic location data, if your file contains any.
  2. The OGDI Metadata Designer is smart and looks for columns that may contain Latitude and Longitude. If it finds them, you will see them selected in the two drop-down boxes on the right. If not, you can select them or leave the fields blank.
  3. Select the Bing To Map checkbox if your file has geo-data and you want it identified as such in your catalogue. This allows the DataBrowser we installed in Part 4 to present each item of your catalogue on a map.
  4. In the Map Push Pin Text Formatting String text box, you can add data from the other columns of your catalogue to the pop-up box on the map. If you do not want to do that, skip ahead to the step about the Dataset Columns Metadata tab.
  5. To do so, select identifiers for the columns in the Push Pin Mapping column. These identifiers are fixed, and there is a short explanation under the text box.
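
The designer detects the Latitude and Longitude columns on its own, and we do not know the exact rules it applies. If you want to check in advance whether your file has columns it is likely to recognise, the following Python sketch illustrates the general idea with a simple header-name match. The function name and the lists of accepted names are our assumptions, not OGDI's.

```python
def guess_geo_columns(headers):
    """Guess which headers hold latitude and longitude.

    Assumption: columns are recognised by common header names only.
    Returns a (latitude_header, longitude_header) pair; either may be None.
    """
    lat_names = {"lat", "latitude"}
    lon_names = {"lon", "lng", "long", "longitude"}

    lat = lon = None
    for header in headers:
        key = header.strip().lower()
        if lat is None and key in lat_names:
            lat = header
        elif lon is None and key in lon_names:
            lon = header
    return lat, lon
```

If either value comes back as `None`, expect to select the column by hand in the drop-down boxes, or to leave the fields blank when the file has no location data.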

  1. You can define extra information about the fields in the catalogue on the Dataset Columns Metadata tab, but the application is smart enough to fill in most of it for you. We still suggest you add at least a better description of each field in the Description column.

  1. Click [Save] when you are done and the application will save all the configuration data in a file next to the CSV or the KML file.
  2. Now the OGDI Data Loader shows a line with some information about the data catalogue you just configured. You are ready to upload it. The important part is to select the Upload Method from the drop-down box and to decide, using the similarly named checkbox, whether you also need to preserve the original data. In our case, we do not have any data in the catalogue yet, so we choose Create.

  1. Click [Start] and monitor the progress of the upload. Depending on the size of your file and the speed of your internet connection, you may need to wait several minutes. The log screen (the lower part of the form) shows textual information about the upload and any errors that occur. Even if you experience errors, some of your data may still have been uploaded. The loader ignores the bad records in your file and reports them on the log screen.

That is all!

Check to see your data by opening the webpages generated by the DataBrowser instance you installed on Windows Azure. The URL looks like this:

http://MyHostedServiceName.cloudapp.net

and you can also locate it in the Properties panel of your particular Hosted Service in your Windows Azure account. You created it in Part 1 of this walkthrough.

Good luck making the world open! And, please spread the news we exist!
