How to save over 70GB of hard drive space in one click!

Recently I found myself wondering where exactly all the space on my hard drive was going. One day it was there, and the next it was gone.

I did my usual Windows clean-up but still wasn’t happy with the outcome, so I explored the Esri side of things to see what could be done. The answer, quite simply, is A LOT, with absolutely minimal effort!

Today I am going to introduce you to a lesser-known tool from the Data Management toolbox (one that is definitely finding its way into my Top 10) called Compact.

The tool does what the name implies, specifically for file (and personal) geodatabases which we all characteristically have scattered across our hard drives.

The underlying architecture of these types of geodatabases relies on binary files. As you add, remove and edit data within the geodatabase, these files become fragmented, which ultimately degrades the performance of your database and wastes disk space.

Compact rearranges how these files are stored on your disk, reducing their overall size and improving performance. WIN-WIN!

To explore just how much of a difference this could make, I wrote a script that iterates through all of the directories on my computer, searching for these geodatabases and running a compact operation on each one. If you’re working with a specific feature class or a database is locked for whatever reason, the script will gracefully skip over it and continue its hunt for free space in your directories. Your overall savings will vary based on the type of work you do with your databases day to day; I personally saw a total of 70 GIGABYTES of data released back into the system. That’s a lot of 0s and 1s.
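If you want to check your own savings, one simple approach is to total the size on disk of a geodatabase (or a whole folder) before and after compacting. The sketch below is not part of the tool: size_on_disk is a hypothetical helper name, and it uses only Python's standard library.

```python
import os

def size_on_disk(path):
    """Return the size in bytes of a file, or of every file under a folder."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # The file disappeared or cannot be read; leave it out of the total
                pass
    return total
```

Calling it on the same path before and after a compact, then subtracting, gives the number of bytes released.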

Geodatabase Compactor

I’ve made the script into a geoprocessing tool which you can download here. If you’re the more inquisitive type, you can right-click on the tool in a Catalog window and click Edit to see the nuts and bolts. It’s a very good example of using Python’s os.walk function to step through files and directories.
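For readers who have not used os.walk before, here is a minimal sketch of the general idea. It is not the code inside the downloadable tool: find_geodatabases and compact_all are hypothetical names, and the compact step is passed in as a function so the directory-walking logic can be tried without ArcGIS installed. It also assumes every .mdb file is a personal geodatabase, which will not hold for ordinary Access databases.

```python
import os

def find_geodatabases(root):
    """Yield paths of file geodatabases (.gdb folders) and .mdb files under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # A file geodatabase is a folder whose name ends in .gdb
        gdbs = [d for d in dirnames if d.lower().endswith(".gdb")]
        for d in gdbs:
            yield os.path.join(dirpath, d)
        # Prune the .gdb folders so os.walk does not descend into them
        dirnames[:] = [d for d in dirnames if d not in gdbs]
        # Assumption: any .mdb file is treated as a personal geodatabase
        for f in filenames:
            if f.lower().endswith(".mdb"):
                yield os.path.join(dirpath, f)

def compact_all(root, compact):
    """Run compact(path) on each geodatabase found; skip any that raise an error."""
    compacted, skipped = [], []
    for gdb in find_geodatabases(root):
        try:
            compact(gdb)
            compacted.append(gdb)
        except Exception:
            # For example a locked database: note it and keep going
            skipped.append(gdb)
    return compacted, skipped
```

With ArcGIS available, you could pass arcpy.Compact_management as the compact function, for example compact_all(r"C:\data", arcpy.Compact_management), and inspect the two returned lists to see what was skipped.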

You can choose the nuclear option like I did, and scan an entire drive, or choose a specific directory for it to iterate through.

If you have background geoprocessing enabled, progress messages will be logged to the Results Window.

Depending on the number of geodatabases you have on your PC, the first run of the tool may take some time. Subsequent runs will be faster as your databases will already be optimised.

Happy space saving!

Check to see if a field exists using Python

Ever wanted to know if a certain field exists in a feature class or attribute table? You might want to populate it if it exists, or create it first and then populate it if it does not. The easy steps below show how to check whether a field exists; if it does not, you can create it and then run a field calculation on it.

First is the code (a function) to check if a field exists (note that the docstring in triple quotes is purely some metadata about this function):

import arcpy

def fieldExists(dataset, field_name):
    """fieldExists(dataset, field_name)

       Determines the existence of a field in the specified data object. Tests
       for the existence of a field in feature classes, tables, datasets,
       shapefiles, workspaces, layers, and files in the current workspace. The
       function returns a Boolean indicating if the field exists.

         dataset(String):
       The name, path, or both of a feature class, table, dataset, layer,
       shapefile, workspace, or file to be checked for existence of the
       specified field.

         field_name(String):
       The name of the field to be checked for existence"""

    # True if field_name is among the dataset's fields, otherwise False
    return field_name in [field.name for field in arcpy.ListFields(dataset)]

Next we will use this code (known as calling the function) to check if a specific field name exists in our feature class. The path to our feature class is C:\data\MyData.gdb\TestFeatureClass and the field name we are going to check for is CATEGORY. We will store these in variables like so:

featureclass = r"C:\data\MyData.gdb\TestFeatureClass"
fieldName = "CATEGORY"

Because the fieldExists function returns True when the field exists, we can use an if statement to do something in that case. We do that by using the following:

if fieldExists(featureclass, fieldName):

Now we need to do something if it does exist. For now we will just print a message to say that it does exist (if it actually does exist in the feature class):

    print("Yes, {0} field exists in {1}".format(fieldName, featureclass))

If this field does exist in the feature class, the message returned will look like this:

Yes, CATEGORY field exists in C:\data\MyData.gdb\TestFeatureClass

At the moment, if the field does not exist, no message is shown, and nothing else can be done either. We now need to say what should happen when the field does not exist in the feature class. This is done with an else statement under the if statement (that’s logical, don’t you think?), like so:

else:

Pretty simple so far? Great!

Now we need to add a message to say that the field does not exist in the feature class:

    print("No, {0} field does not exist in {1}".format(fieldName, featureclass))

If this field does not exist in the feature class, the message returned will look like this:

No, CATEGORY field does not exist in C:\data\MyData.gdb\TestFeatureClass

After that, at the same indentation as the print statement, you can use something like arcpy.AddField_management() to add the missing field that needs to be populated.
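As a rough illustration of that last step, the check and the AddField call can be wrapped in one helper. This is only a sketch under a few assumptions: ensure_field and its gp parameter are hypothetical names, the "TEXT" field type is a placeholder you would change to suit your data, and names are compared case-insensitively on the assumption that your data source treats field names that way. The gp parameter lets you hand in a stand-in object for testing; when it is left out, arcpy is imported and used.

```python
def ensure_field(dataset, field_name, field_type="TEXT", gp=None):
    """Add field_name to dataset if it is missing.

    Returns True if the field was added, False if it already existed.
    """
    if gp is None:
        # Imported here so the function can be tried out with a stand-in object
        import arcpy as gp
    existing = [field.name.upper() for field in gp.ListFields(dataset)]
    if field_name.upper() in existing:
        return False
    # Assumption: "TEXT" is only a placeholder type for this example
    gp.AddField_management(dataset, field_name, field_type)
    return True
```

Called as ensure_field(featureclass, fieldName), it leaves an existing CATEGORY field alone and only adds the field when it is missing, after which the field can be populated.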

The completed script looks like this (You can copy and paste the code below and re-use it in your own script):

import arcpy  # don't forget to import arcpy

def fieldExists(dataset, field_name):
    # True if field_name is among the dataset's fields, otherwise False
    return field_name in [field.name for field in arcpy.ListFields(dataset)]

featureclass = r"C:\data\MyData.gdb\TestFeatureClass"
fieldName = "CATEGORY"

if fieldExists(featureclass, fieldName):
    print("Yes, {0} field exists in {1}".format(fieldName, featureclass))
else:
    print("No, {0} field does not exist in {1}".format(fieldName, featureclass))
    # Add the missing field here, e.g. with arcpy.AddField_management()