Modelling the holiday-based redistribution of South Africans in December

MAP SERIES

Over the coming months, I’m planning on developing a map series to showcase often overlooked aspects of Cartography and GIS. The idea is to explore topical subject matter to create insightful and (hopefully) beautiful maps.

This is the first in the series.


Every December, hundreds of thousands of South African holiday-makers push pause on their lives and scatter across the country, making time to explore, relax and unwind.

I got to wondering if there would be a simple way of modelling this behaviour. Surely there must be some universal underlying factors that could be used to help explain where people go in December? I also knew I wanted to represent my data in a non-traditional way.

For the sake of simplicity, I limited my focus to South Africans moving within South Africa for the holiday season and eventually settled on four broad factors to consider:

  • F1 [-] Distribution of population during the rest of the year
  • F2 [+] Accessibility (using major roads as a proxy)
  • F3 [+] Distribution of holiday accommodation
  • F4 [+] Distribution of National Parks

There are obviously many more factors at play; however, these four seemed to interact spatially in a dynamic enough way across the country that I was happy to move forward with my investigation.

The density per factor was calculated per municipality, normalised across the country and combined into an equation that attempts to model the interaction between these factors as a linear function.

equation

In the formula, population density acts as a push factor – people will be moving away from areas of high population density towards areas with low population density. The availability of accommodation, how accessible the area is and the distribution of national parks all act as pull factors.

The amount that each factor contributes towards the final index is controlled with weights, and the global difference within each variable is exaggerated by squaring its normalised value to highlight the most favourable areas more clearly.

The final index can be used to rank order each municipality based on the likelihood that it will be visited in December by people who do not live in that region.
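The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the formula from the original equation image: min-max normalisation, the equal placeholder weights and the function names are all assumptions made for the example.

```python
def normalise(values):
    """Min-max normalise a list of densities to the range 0..1 (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def holiday_index(population, roads, accommodation, parks,
                  weights=(1.0, 1.0, 1.0, 1.0)):
    """Combine the four factors linearly; weights are hypothetical placeholders.

    Population (F1) is a push factor and enters with a negative sign;
    roads (F2), accommodation (F3) and parks (F4) are pull factors.
    Each normalised value is squared to exaggerate the differences.
    """
    f1, f2, f3, f4 = (normalise(f) for f in
                      (population, roads, accommodation, parks))
    w1, w2, w3, w4 = weights
    return [-w1 * a ** 2 + w2 * b ** 2 + w3 * c ** 2 + w4 * d ** 2
            for a, b, c, d in zip(f1, f2, f3, f4)]


def rank(index):
    """Return the rank of each municipality (1 = most likely to be visited)."""
    order = sorted(range(len(index)), key=lambda i: index[i], reverse=True)
    ranks = [0] * len(index)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


# Three made-up municipalities: a dense city, a remote park area, a mid-sized town.
index = holiday_index(population=[100, 10, 50], roads=[1, 5, 3],
                      accommodation=[2, 8, 5], parks=[0, 3, 1])
ranks = rank(index)
print(ranks)
```

Changing the weights shifts how strongly each factor pulls or pushes, which is the knob the post refers to.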

These values were then used to generate the following cartogram:

dec_mapseries_cartogram_screenshot

  • You can explore the map right down to the municipal level
  • The shades of blue represent the percentage change in surface area relative to the region’s usual size. This is affected by the rank as well as the relative difference in the ranks surrounding the area.
  • National parks are included as well as major cities as you zoom in for context
  • The top 20 sites are highlighted with the concentrically banded points
  • Clicking anywhere on the map will return the overall rank for that region

Cartograms have been around since the 1800s. They provide us with a new perspective on our world by taking a thematic variable and typically substituting it for the area of the land that it represents.

The creation of cartograms comes with several challenges, as regions must be scaled and still fit together. A relatively recent (2004) and popular method of generating contiguous cartograms is the Gastner-Newman method. Compared with earlier approaches, it is faster, conceptually simpler and produces easily readable cartograms. The algorithm guarantees topology and general shape preservation (albeit with some distortion). It also allows users to choose their own balance between good density equalization and low distortion of map regions, making it flexible for a wide variety of applications.

Now I need YOUR help.

Taking this one step further, I’ve configured a crowdsourcing web application which will allow users to post about their holiday destinations in a collaborative manner.

You will be able to access this from anywhere on any device and see information contributed by all users of the application. My hope with this is that this information will further support the outcome of the formula and cartogram produced in this exercise.

destinationwhere

Please share far and wide and happy holidays!

How to save over 70GB of hard drive space in one click!

Drives

Recently I found myself wondering where exactly all the space on my hard drive was going. One day it was there, and the next it was gone.

I did my usual Windows clean-up but still wasn’t happy with the outcome, so I did a bit more exploring on the Esri side of things to see what could be done. The answer, quite simply, is A LOT, with absolutely minimal effort!

Today I am going to introduce you to a lesser-known tool from the Data Management Toolbox (and one definitely finding its way into my Top 10) called Compact.

The tool does what the name implies, specifically for file (and personal) geodatabases which we all characteristically have scattered across our hard drives.

The underlying architecture of these types of geodatabases relies on binary files. As you add, remove and edit data within the geodatabase, these files become fragmented, which ultimately decreases the performance of your database and wastes disk space.

What Compact does is rearrange how these files are stored on your disk, reducing the overall size and improving overall performance. WIN-WIN!

To explore just how much of a difference this could make, I wrote a script that iterates through all of the directories on my computer, searching for these geodatabases and performing a compact operation on each. If a geodatabase is in use (for example, you’re working with one of its feature classes) or is locked for whatever reason, the script will gracefully skip over it and continue its hunt for free space in your directories. Your overall savings will vary with the type of work you do with your databases day to day; I personally saw a total of 70 GIGABYTES of data released back into the system. That’s a lot of 0s and 1s.

Geodatabase Compactor

I’ve made the script into a geoprocessing tool which you can download here. If you’re the more inquisitive type, you can right-click on the tool in a Catalog window and click Edit to see the nuts and bolts. It’s a very good example of using Python’s os.walk function to step through files and directories.

You can choose the nuclear option like I did, and scan an entire drive, or choose a specific directory for it to iterate through.
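For readers who would rather see the idea than open the tool, here is a minimal sketch of the approach. It is not the downloadable script itself: the function names are made up for illustration, and it assumes file geodatabases are folders ending in .gdb and personal geodatabases are files ending in .mdb (not every .mdb is a geodatabase, so failures are simply skipped). The actual compaction is delegated to arcpy.Compact_management, imported lazily so the directory walk can be tried without ArcGIS installed.

```python
import os


def find_geodatabases(root):
    """Yield paths of .gdb folders and .mdb files found below root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames):
            if name.lower().endswith(".gdb"):
                yield os.path.join(dirpath, name)
                # A file geodatabase is a folder; do not walk inside it.
                dirnames.remove(name)
        for name in filenames:
            if name.lower().endswith(".mdb"):
                yield os.path.join(dirpath, name)


def arcpy_compact(path):
    """Compact one geodatabase with ArcGIS (requires an arcpy installation)."""
    import arcpy
    arcpy.Compact_management(path)


def compact_all(root, compact=arcpy_compact):
    """Try to compact every geodatabase under root, skipping any that fail."""
    compacted, skipped = [], []
    for gdb in find_geodatabases(root):
        try:
            compact(gdb)
            compacted.append(gdb)
        except Exception:
            # Locked, in use, or not really a geodatabase: move on.
            skipped.append(gdb)
    return compacted, skipped
```

Calling compact_all with a drive root or a single project folder mirrors the two options described above; the returned lists show what was compacted and what was skipped.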

If you have background geoprocessing enabled, progress messages will be logged to the Results Window.

Depending on the number of geodatabases you have on your PC, the first run of the tool may take some time. Subsequent runs will be faster as your databases will already be optimised.

Happy space saving!

Determining Solar Potential for Rooftops of Multipatch Feature Types

Part of the Modelling Reality in 3D series

Often there are problems that simply have to be solved in 3 dimensions in order to obtain appropriate results. This doesn’t have to be scary though! Through this series of blog posts, Modelling Reality in 3D, we’re going to uncover some simple and practical uses for 3D GIS.

In this demo we’ll use some often overlooked tools from the Spatial Analyst extension to determine the potential of the rooftops of multipatch features (multipatch is Esri’s geometry type for 3D features) for generating electricity from the power of the sun!

For this exercise we’ll be using a multipatch feature class from HERE’s 3D Landmark dataset of the Dome in Northgate, Randburg, as its construction lends itself quite nicely to an exercise of this kind. The workflow should work on any other multipatch with a ‘roof’ area with minimal tweaking, as long as you keep in mind that the model assumes that skyward-facing portions of the multipatch are rooftop areas.

dome1

The high level workflow and tools used for this exercise are as follows:

dome7

A toolbox can be downloaded HERE in which you can delve further into the parameters set for this demo. We will be discussing it on a conceptual level on the blog.

Prepare Usable Roof Area

This model calculates the maximum potential that can be harnessed by a rooftop, so we first need to define what this region is. The Area Solar Radiation tool, which we’ll discuss later on, requires a DEM as input and reports results per square metre. The rooftop therefore needs to be represented as a DEM, and to make later calculations easier we use pixels of 1 m × 1 m.

dome2

Using the Slope and Raster Calculator tools from the Spatial Analyst extension, we extract all of the areas with a slope of 36 degrees or less, which gives a good approximation of the rooftop area that could hold a photovoltaic cell. We then use a number of other raster-based tools from this extension to clean up the roof area we will be working with.

dome3

Using this rooftop area, we then extract from the building’s DEM only the portions that fall within the roof area required for our analysis.

dome4

Calculate Global Solar Radiation

Using the Area Solar Radiation tool we determine the global radiation expected to hit the roof of this building over an entire year. This is the combination of direct and diffuse radiation, and the pixel values are in watt-hours per square metre. In this exercise I used all of the default values as they were well suited to the area in which this building lies; however, you can change a number of parameters that affect the amount of light that would eventually reach your rooftop.

dome5

Additional outputs include views of both the direct and diffuse radiation which make up the global radiation as seen above as well as a DirectDuration ‘map’ which indicates in hours the amount of time each pixel would receive direct solar radiation.

dome6

Prepare Basic Contextual Statistics

Now that we have a result, we need to make sense of it, and often the best way to go about this is by providing context. The following statistics were calculated based on the global solar radiation values.

Statistic | Result | Assumption
Total Global Radiation | 3 192 297 067 Wh | Conditions modelled in the Area Solar Radiation tool are correctly indicative of an average year for the site.
Total Area | 21 189 m² |
Solar Electricity Potential | 3 192 297 kWh |
Largest Possible System Cost | R35 455 711 | Based on a solar panel with the following specifications: Module Output: 310 W; Cost: R3246.86 per unit; Size: 1.940352 m² (http://www.sustainable.co.za/jinko-jkm310p-310w-solar-panel-pallet-of-28.html)
Largest Possible System Size | 3 385 200 W (3 385.2 kW) |
Solar System Potential | 4 077 473 kWh/year | Based on a running time of 5 hours of maximum output for the largest possible system every day for a year, with loss factors accounting for temperature (6%), dust (7%), wiring (5%) and DC/AC conversion (20%).

Number of households that could be powered per month, either:

Consumption level | Households | Assumption
Low Consumption | 680 | 500 kWh per month
Medium Consumption | 227 | 1500 kWh per month
High Consumption | 113 | 3000 kWh per month
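The figures in the table can be reproduced with some simple arithmetic. The sketch below is a reconstruction, not the calculation from the toolbox: it assumes whole panels are packed edge to edge over the total area, and it uses an overall loss factor of 0.66, which is inferred because it reproduces the published annual figure (the product of the four listed losses is about 0.664).

```python
import math

# Inputs taken from the table above.
TOTAL_AREA_M2 = 21189
PANEL_AREA_M2 = 1.940352
PANEL_OUTPUT_W = 310
PANEL_COST_RAND = 3246.86
PEAK_HOURS_PER_DAY = 5

# Assumption: combined derate factor inferred from the published result.
DERATE = 0.66

panels = math.floor(TOTAL_AREA_M2 / PANEL_AREA_M2)       # whole panels only
system_cost_rand = panels * PANEL_COST_RAND
system_size_w = panels * PANEL_OUTPUT_W
system_size_kw = system_size_w / 1000.0

annual_kwh = system_size_kw * PEAK_HOURS_PER_DAY * 365 * DERATE
monthly_kwh = annual_kwh / 12

# Households that could be supplied at each monthly consumption level.
households = {level: round(monthly_kwh / kwh)
              for level, kwh in (("low", 500), ("medium", 1500), ("high", 3000))}

print(panels, round(system_cost_rand), system_size_w, round(annual_kwh))
print(households)
```

Note that 10 920 panels at 310 W each give 3 385 200 W, which is 3 385.2 kW.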

Conclusion

Obviously this approach is based on a number of assumptions, which would be made explicit on a real project of this nature and scale. A number of factors have also been disregarded, such as the weight of the system and how much load the roof structure could bear. What this model does do is quickly provide an indication of the potential of rooftop-based solar energy in South Africa, and hopefully it showcases the power of the tools within our software in a 3D context!

Migrating Python Scripts to ArcGIS Pro

What’s new?

MainImage

With the migration towards 64-bit processing in ArcGIS Pro, some big changes have come to the Python environment as well.

  1. Python in ArcGIS Pro has been upgraded to version 3.4. All other ArcGIS products are still using version 2.7. Both versions of Python are being developed in parallel and share much the same functionality.
  2. Changes have been made to the functionality within the arcpy site package. This includes the dropping of some functionality and the augmentation of others, e.g. arcpy.mapping has been replaced with arcpy.mp in ArcGIS Pro to support ArcGIS Pro’s mapping workflows. For a detailed overview of changes consult the following page.

Assessing the situation

ArcGIS Pro comes with a geoprocessing tool called Analyze Tools for Pro (Data Management Tools > General). This uses the Python utility 2to3 to identify issues when migrating a script and even goes so far as to identify functionality that has not been migrated to ArcGIS Pro. Running this tool generates an output that states which lines have errors and suggests appropriate changes, which you can go through and assess manually. Often the required changes are small and can be made quickly without automation.

Converting your scripts

Sometimes a script is simply too big to go through manually. Thankfully Python 3 comes with a tool to help automate the conversion process.

NOTE: The following steps make changes to the input script. We recommend making a copy of the script being converted, appending 34 to its name, and making changes to this copy so that you leave the original intact, e.g. script.py -> script34.py. If you choose not to do this, don’t worry: the conversion tool creates a copy of the file in the same directory with a .bak extension to ensure the original script is preserved.

  1. Run Command Prompt as an Administrator
  2. Type in the following:

PythonInPro34

where C:\script34.py is the path to the script you want to convert

  3. Once done, the script should contain most of the changes required to make it functional using Python 3 in ArcGIS Pro. We say most because some functionality within Python could have moved, been renamed or been replaced, which would require manual intervention on your part where the 2to3 utility could not make the required changes.

Writing scripts to work in both Python 2 and Python 3

The following tips will help greatly in ensuring a Python script will work in both ArcMap and ArcGIS Pro as long as the tools referenced in the script are available in the ArcGIS Pro version of arcpy. By making these practices a habit when scripting in the Python 2 environment, you will greatly ease the transition into the world of Python 3.

  • Tip 1:

Adding the following line to the top of your script will import some of the new rules enforced in Python 3 to your Python 2 script:

PythonInPro34_2

print_function

The print statement has been replaced with the print function. This function is also available in Python 2 and by using it you ensure your scripts will work in both environments.

PythonInPro34_4.png

division

Python 3 handles division of integers in a more natural line of thinking. This is one of my favourite new things as it makes things far less confusing for people just starting out with Python.

Python 2: 3/2 = 1
Python 3: 3/2 = 1.5

If you find you need to use the old truncating division, you can simply use ‘//’ instead of ‘/’.

absolute_import

In Python 3, imports are absolute by default: a plain import statement always looks for a top-level module or package. Imports relative to the current package need to be stated explicitly (for example, from . import helper). This relates to complex scripts referencing other scripts and for the most part will not affect the majority of users. For more on the implications of this click here.

unicode_literals

String literals are unicode on Python 3 and making them unicode on Python 2 leads to more consistency of your string types across the two runtimes. This can make it easier to understand and debug your code!

Basically “Some string” in Python 3 is now equivalent to u”Some string” in Python 2.

If you want to use 8-bit strings like the default in Python 2, simply place a ‘b’ in front of it and you’re good to go.
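Taken together, Tip 1 might look like the sketch below. The import line is inferred from the four subsection names above rather than copied from the original screenshot, and the variable names are only for illustration.

```python
# The __future__ import must come before any other statement in the script
# (comments and a docstring may precede it).
from __future__ import print_function, division, absolute_import, unicode_literals

# print is a function in both Python 2 and Python 3 once print_function is imported.
print("Converted", 3, "scripts")

true_quotient = 3 / 2     # 1.5 in both versions with the division import
floor_quotient = 3 // 2   # 1, the old truncating behaviour for integers

text = "Some string"      # Unicode text in both versions with unicode_literals
raw = b"Some string"      # 8-bit string: bytes in Python 3, str in Python 2
```

In Python 3 the import line has no effect, so the same script runs unchanged in both environments.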

  • Tip 2:

Import known modules with changes in the following fashion to ensure that the required functionality will be available within your script:

PythonInPro34_3
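One common way to do this is to try the Python 3 location first and fall back to the Python 2 name. The modules below are examples chosen for illustration; they are not necessarily the ones shown in the original screenshot.

```python
try:
    # Python 3 locations
    from urllib.request import urlopen
    import configparser
except ImportError:
    # Python 2 fallbacks, bound to the same names
    from urllib2 import urlopen
    import ConfigParser as configparser

# The rest of the script can now use urlopen and configparser
# without caring which Python version is running it.
parser = configparser.ConfigParser()
```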

Cheat sheet to changes in Python 3

  • Adding the following line at the top of your script declares its encoding, so that the source is parsed as UTF-8: PythonInPro34_5

In Python 3 you no longer have to convert text to Unicode explicitly: anything within quotation marks is a Unicode string, decoded using the encoding declared for the source file (UTF-8 by default).

  • Exceptions are no longer iterable. You are required to use the exception’s args attribute to print messages:

PythonInPro34_6

  • The int and long types have been merged. Before, one could simply write 5L and the 5 would be a long integer; now this will give you a syntax error. If you explicitly need to set a long integer, the following approach is required:

PythonInPro34_7
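The last two cheat-sheet points can be sketched as follows. This is an illustration rather than the code from the original screenshots; describe_error is a made-up helper name, and aliasing long to int is one of several ways to keep old code working.

```python
try:
    long            # the name exists in Python 2
except NameError:
    long = int      # Python 3: int already has unlimited precision


def describe_error(err):
    """Join an exception's arguments into one message via its args attribute."""
    return " | ".join(str(arg) for arg in err.args)


try:
    raise ValueError("Workspace is locked", 42)
except ValueError as err:
    message = describe_error(err)

big = long(5) ** 40   # works in both versions, no 5L literal needed
print(message, big)
```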

One of the foundations of the ArcGIS Platform is the concept of extensibility: the ability to allow users to extend the functionality of the software beyond its out-of-the-box processing capabilities to suit the required workflow. The Python scripting language lends itself very effectively to this end. Using some of the tips outlined in this post you’ll be well on your way towards producing adaptable Python scripts that speak to the needs of users within multiple environments.

Happy scripting!

Tips and Tricks for Geocoding in ArcGIS Online


Placing an address on a map, either to find a place or to provide business context, is becoming vitally important in our society. Location matters.

Most commonly address information that is stored in a database is not something that is regularly maintained. Often information is captured in free text fields which results in data irregularities and inconsistencies.

The purpose of this article is to provide some insight into how to better manage an address dataset which would potentially be batch geocoded and how to optimise the capturing of these address datasets for geocoding in ArcGIS Online.

There are a number of variables at play which can affect the final outcome of a geocoding exercise, the most pivotal being the quality and accuracy of the reference data you are matching against. It is never as simple as receiving an address dataset and geocoding it. Clients often want quantifiable measures of accuracy for the geocoded dataset, and the GIS personnel working on the project are often expected to clean and normalise addresses in order to improve match rates.

Here are a few helpful tips which will help ensure accurate geocodes when using the World geocoder in ArcGIS Online.

Helpful Tips

  1. Use single-line addresses

Geocoding single-line addresses is both faster and often more accurate than feeding the address records to the geocoder field by field. This is for a number of reasons, the most obvious being that information is often captured in the wrong field.

  2. An address should look like an address

The ArcGIS Online geocoder uses a form of programmatic pattern matching. If an address does not match the patterns in the locator, your geocodes suffer.

Best practice is to ensure your addresses look as follows:

Normal Address:

[HOUSE NUMBER] [ ] [STREET NAME] [ ] [STREET TYPE] [, ] [SUBURB] [, ] [CITY] [, ] [PROVINCE] [, ] [POSTAL CODE]

Corner Address:

[CORNER OF] [ ] [STREET NAME] [ ] [STREET TYPE] [ ] [AND] [ ] [STREET NAME] [ ] [STREET TYPE] [, ] [SUBURB] [, ] [CITY] [, ] [PROVINCE] [, ] [POSTAL CODE]

POI Address:

[POI] [, ] [SUBURB] [, ] [CITY] [, ] [PROVINCE] [, ] [POSTAL CODE]

  3. A city is more important than a suburb

Suburbs in South Africa remain loosely defined and differ from dataset to dataset. The inclusion of extensions creates an additional host of problems and often suburb names change, or an individual may say their street falls in a neighbouring suburb for various reasons. You are more likely to get an accurate geocode using a city alone instead of using a suburb which does not match the suburb in the reference data you’re matching against.

  4. Never trust a postal code

Many people do not even know their postal code, and including an incorrect postal code in an address for geocoding in ArcGIS Online does more harm than good, as the address will be scored down. What makes things even more confusing is that a particular street may have a ‘box’ code and a ‘street’ code which differ, and both may not be accurately represented in the reference data being matched against. If you are going to include postal codes in your addresses to geocode, please ensure they all have four digits, otherwise ArcGIS Online will not recognise the postal code for what it is.
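The tips above can be combined into a small helper that builds a single-line address in the “Normal Address” pattern. This is a minimal sketch and not part of the toolbox: the function name and parameters are made up for illustration, and padding short postal codes with leading zeros assumes the zeros were dropped somewhere along the way (for example by a spreadsheet).

```python
def build_single_line(house_number, street_name, street_type, city,
                      suburb="", province="", postal_code=""):
    """Return a single-line address, leaving out any empty components."""
    street_parts = (house_number, street_name, street_type)
    street = " ".join(p.strip() for p in street_parts if p and p.strip())

    # Assumption: a short postal code lost its leading zeros, so pad to four digits.
    postal = str(postal_code).strip()
    if postal:
        postal = postal.zfill(4)

    parts = (street, suburb, city, province, postal)
    return ", ".join(p.strip() for p in parts if p and p.strip())


# The suburb is optional, since a city alone often geocodes more reliably.
address = build_single_line("12", "Main", "Road", "Johannesburg",
                            province="Gauteng", postal_code=157)
print(address)
```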

Preparing addresses for batch geocoding can be quite tedious, so we have created a toolbox to get you started with automating the process!

Python Toolbox

Clicking the image above will download an archive containing a toolbox with a simple Python script that uses a lookup table of freely available data from Statistics South Africa and the South African Post Office to attempt to normalise and clean address datasets prior to geocoding, particularly for ArcGIS Online. You can use it in the same way you would use any other tool in ArcMap. Applying the 80/20 principle, we have attempted to use the minimal amount of code in order to clean and normalise the majority of addresses; however, each dataset is going to have its own nuances, so it will be up to you to modify the script in order to optimise it for each of your use cases.

If you’ve never used Python, don’t despair: the tool already does most of the heavy lifting for you, and there is still much to be gained by adding text replacements and additional street types to the portions of the code indicated below. Simply navigate to the toolbox in an ArcCatalog window, right-click on the script and select “Edit…” to incorporate the additional records as and when required. If you would like to add additional functionality, some Python scripting knowledge will be advantageous.

Geocoding in AGOL
Adding additional entries to the following dictionary will allow for more control over the text replacements performed on the addresses being normalised
Geocoding in AGOL snippet
Adding additional street types to the following list will allow the script to identify the street address portion of more input addresses
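To give a feel for what such a dictionary and list might look like, here is a small sketch. The entries, names and logic are hypothetical and much simpler than the toolbox script; they only illustrate how adding entries extends what the cleaning step can recognise.

```python
# Hypothetical text replacements: lower-case abbreviation -> normalised form.
REPLACEMENTS = {
    "rd": "Road",
    "str": "Street",
    "ave": "Avenue",
    "dr": "Drive",
    "cnr": "Corner of",
}

# Hypothetical list of street types used to spot the street portion of an address.
STREET_TYPES = ["Road", "Street", "Avenue", "Drive", "Lane"]


def clean_address(address):
    """Expand known abbreviations word by word, keeping trailing commas."""
    cleaned = []
    for word in address.split():
        core = word.rstrip(".,")
        tail = "," if word.endswith(",") else ""
        replacement = REPLACEMENTS.get(core.lower())
        cleaned.append((replacement if replacement else core) + tail)
    return " ".join(cleaned)


def has_street_type(address):
    """Return True if any word in the address is a known street type."""
    return any(word.rstrip(".,") in STREET_TYPES for word in address.split())


cleaned = clean_address("12 Main Rd., Johannesburg")
print(cleaned, has_street_type(cleaned))
```

Adding an entry such as "blvd": "Boulevard" to the dictionary, and "Boulevard" to the list, would extend both steps without touching the functions.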

Ultimately, the expectations for any geocoding exercise need to be realistically aligned with the quality of input address data. We must be aware that many datasets in South Africa still have a long way to go and with the dynamic nature of road networks there will always be gaps in the reference data used for geocoding, even in ArcGIS Online. It is up to us as the GIS users to ensure that we prepare our data correctly prior to geocoding in order to achieve the favourable results we seek.