Choose the right type of theme to suit your data


You’ve almost certainly seen many thematic maps. These kinds of maps are common nowadays in newspapers and on the internet, wherever the intention is to make a quantitative comparison of data. For example, a map of a country that shows election results is a thematic map. A map of the world showing GDP (gross domestic product) for each country is a thematic map.

A common use of thematic maps in local government is to show assessed property values. Other examples would be a zoning map or a map showing school districts.

In AutoCAD Map 3D, you theme a layer based on a particular property or attribute. For example, you could theme a layer containing parcels according to the LAND_VALUE property of each parcel, or you could theme the same features by the AREA property. In this way, you can have multiple thematic layers in the same map, all based on the same features. Normally, of course, you only display one theme at a time (although you can create some useful maps by overlaying one semi-transparent theme over another—for example, a zoning layer on top of a parcels layer).

Equally important is the way that you stratify your data when you create a theme. You can get very different results depending on how you categorize and divide up your data. For this reason, it is a good idea to chart your data first. The chart will help you decide which kind of theme is most appropriate. You can generate a simple chart in a spreadsheet program, such as Microsoft Excel. The chart in the illustration below shows the distribution of values in the LAND_VALUE column of a parcels database.

Tip: To get your data into a spreadsheet, you can export it in .CSV format. See Generate a report by exporting records to a spreadsheet.
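If a spreadsheet is not at hand, the same quick look at the distribution can be sketched in a few lines of Python. This is a minimal sketch, not part of AutoCAD Map 3D: the file name parcels.csv, the helper names, and the sample values are all hypothetical, and it assumes the exported .CSV has a header row containing a LAND_VALUE column.

```python
import csv
from collections import Counter


def load_column(path, column="LAND_VALUE"):
    """Read one numeric column from a CSV export, skipping blank or non-numeric cells.

    A misspelled column name raises KeyError rather than being silently ignored.
    """
    values = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                values.append(float(row[column]))
            except (TypeError, ValueError):
                continue
    return values


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def print_histogram(counts):
    """Print one text bar per non-empty bin."""
    for lower, n in counts.items():
        print(f"{lower:>10,} | {'#' * n} {n}")


# Hypothetical sample standing in for load_column("parcels.csv")
sample = [45_000, 80_000, 120_000, 150_000, 160_000, 210_000, 950_000]
print_histogram(histogram(sample, 200_000))
```

A long bar in the lowest bin with only a few scattered values above it is the kind of skew that affects how well each classification method works.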

The illustration below shows four maps of the same set of parcels. Each theme is based on the same LAND_VALUE property, and each one divides the data into six classes or ranges. However, each map uses a different method of separating the data into those six ranges.

These are the four methods:

  • Equal (upper left)—Creates ranges for the data at equal intervals. For example, if the parcels range in value from 0 to 1,200,000, this method makes six classes, each one starting 200,000 higher than the one before.

    Pros: Easy to interpret as data ranges are all equal. Useful for comparing a series of maps using the same data ranges. Works best with continuously distributed data, such as temperatures or precipitation amounts.

    Cons: Does not consider how the data is actually distributed. Almost all data values may fall into one class, or there may be classes with no values in them. This is in fact what we see in the map above (upper left). Almost all the parcels are the same yellow color.

    The chart below shows how the six equal classes relate to the data in our chart. As you can see, the vast majority of the values fall into the first class, with values between 0 and 200,000.

  • Quantile (upper right)—Each range has the same number of features. For example, if there are 36,000 features and six ranges, each range will have 6,000 features assigned to it.

    Pros: Always produces a good-looking map with even dispersal of colors, as no class has too few or too many values in it.

    Cons: Can force similar values into different classes, or lump together dissimilar values into the same class. Implies that similar colors have similar values, when in fact this may not be the case. The chart below shows the parcel data distributed according to the quantile method.

  • Standard deviation (lower left)—Finds the mean value, then places class breaks above and below the mean at equal intervals, measured in standard deviations. The standard deviation tells you how spread out the values are from the mean. Normally distributed data shows a "bell curve" shape, with the majority of values clustered around the mean.

    Pros: Brings out the contrast in the data values by using the mean as a dividing point. With an even number of classes, the mean falls on a class break, with an equal number of classes above and below it.

    Cons: Requires a basic understanding of statistical concepts. May be difficult for users of the map to interpret. Works well only on data that is approximately normally distributed (this is why it doesn’t work for our parcel data, which is heavily weighted toward one end of the range).

  • Natural breaks (lower right)—Ranges are determined based on statistically significant groupings in the data. Classes start and end where there are jumps in the data values.

    Pros: Closely reflects the actual distribution of the data values. Features having similar values are placed in the same class. Good for showing uneven distribution.

    Cons: The concept behind this classification may not be easily understood by all users of the map. The class breaks shown in the legend may look arbitrary because the data ranges are uneven. Does not work well with data that is heavily weighted toward one end of the distribution.
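The four methods above can be sketched in Python to show how each one turns the same values into class breaks. This is an illustration under stated assumptions, not the AutoCAD Map 3D implementation: the function names and sample values are made up, the standard deviation classes are assumed to be one standard deviation wide, and natural breaks is assumed to mean the split that minimizes squared deviation within each class, which is one common way to formalize it.

```python
import statistics
from bisect import bisect_left


def equal_breaks(values, k):
    """Interior breaks for k classes of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def quantile_breaks(values, k):
    """Interior breaks for k classes holding (nearly) equal numbers of values."""
    xs = sorted(values)
    n = len(xs)
    # Class i ends at the ceil(i * n / k)-th smallest value (integer arithmetic).
    return [xs[(i * n + k - 1) // k - 1] for i in range(1, k)]


def stddev_breaks(values, k):
    """Interior breaks one standard deviation apart, centred on the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population form; an assumption
    return [mean + (i - (k - 2) / 2) * sd for i in range(k - 1)]


def natural_breaks(values, k):
    """Interior breaks minimising within-class squared deviation (needs len >= k)."""
    xs = sorted(values)
    n = len(xs)
    s1, s2 = [0.0], [0.0]  # prefix sums of x and x*x
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def ssd(i, j):
        """Sum of squared deviations from the mean for xs[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: lowest total deviation when xs[:j] is split into c classes
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + ssd(i, j)
                if candidate < cost[c][j]:
                    cost[c][j], cut[c][j] = candidate, i
    breaks, j = [], n
    for c in range(k, 1, -1):  # walk back through the chosen cut points
        j = cut[c][j]
        breaks.append(xs[j - 1])
    return breaks[::-1]


def class_counts(values, breaks):
    """Number of values per class; a value equal to a break goes in the lower class."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        counts[bisect_left(breaks, v)] += 1
    return counts


# Made-up, heavily skewed land values (not taken from the parcel data)
values = [0, 40_000, 60_000, 80_000, 100_000, 120_000, 140_000,
          160_000, 180_000, 190_000, 600_000, 1_200_000]

for method in (equal_breaks, quantile_breaks, stddev_breaks, natural_breaks):
    print(f"{method.__name__:16}", class_counts(values, method(values, 6)))
```

With this skewed sample, the equal-interval breaks put 10 of the 12 values in the first class, while the quantile breaks put two values in each class, which mirrors the contrast between the upper-left and upper-right maps.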

Examples

Having analyzed our data as described above, we conclude that, of the four methods, Quantile will probably give us the best map for our data. In the end, the choice of method depends on what we want to examine or emphasize in the data: what is the purpose of the map, and who is the intended audience?

Whichever method we choose, the procedure for theming the layer is the same, as shown in the following demonstration.

Show me how to theme a parcels layer