The GEH Statistic is a formula used in traffic engineering, traffic forecasting, and traffic modelling to compare two sets of traffic volumes. The GEH formula gets its name from Geoffrey E. Havers, who invented it in the 1970s while working as a transport planner in London, England. Although its mathematical form is similar to a chi-squared test, it is not a true statistical test. Rather, it is an empirical formula that has proven useful for a variety of traffic analysis purposes.
The formula for the "GEH Statistic" is:
$$\mathrm{GEH} = \sqrt{\frac{2(M-C)^2}{M+C}}$$
where $M$ is the hourly traffic volume from the traffic model (or the new count) and $C$ is the real-world hourly traffic count (or the old count).
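The formula can be sketched in a few lines of Python. The function name `geh` and the handling of two zero volumes are assumptions made for this illustration and are not part of the original definition.

```python
import math


def geh(m: float, c: float) -> float:
    """GEH statistic for a modelled hourly volume m and a counted hourly volume c."""
    if m < 0 or c < 0:
        raise ValueError("traffic volumes must be non-negative")
    if m + c == 0:
        # Assumption: two zero volumes are treated as a perfect match,
        # because the formula itself is undefined for M + C = 0.
        return 0.0
    return math.sqrt(2.0 * (m - c) ** 2 / (m + c))


# Hypothetical example: 1100 modelled vs. 1000 counted vehicles per hour.
print(round(geh(1100, 1000), 2))  # 3.09
```

Swapping the two arguments gives the same value, because the formula contains only the squared difference and the sum of the two volumes.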
Using the GEH Statistic avoids some pitfalls that occur when using simple percentages to compare two sets of volumes, because traffic volumes in real-world transportation systems vary over a wide range. For example, the mainline of a freeway/motorway might carry 5000 vehicles per hour, while one of the on-ramps leading to the freeway might carry only 50 vehicles per hour. In that situation it would not be possible to select a single percentage of variation that is acceptable for both volumes. The GEH statistic reduces this problem: because it is non-linear, a single acceptance threshold based on GEH can be used over a fairly wide range of traffic volumes. The use of GEH as an acceptance criterion for travel demand forecasting models is recognised in the UK Highways Agency's Design Manual for Roads and Bridges,[1] the Wisconsin microsimulation modeling guidelines,[2] the Transport for London Traffic Modelling Guidelines,[3] and other references.
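The freeway and on-ramp example can be made concrete with a short sketch. The 10% overestimate applied to both volumes is a hypothetical figure chosen for illustration, and the helper name `geh` is likewise an assumption.

```python
import math


def geh(m, c):
    """GEH statistic for modelled volume m and counted volume c (both per hour)."""
    return math.sqrt(2.0 * (m - c) ** 2 / (m + c))


# Hypothetical case: the model overestimates both counts by the same 10%.
for count in (5000, 50):
    modelled = count * 1.10
    pct = 100.0 * (modelled - count) / count
    print(f"count={count:>5}  modelled={modelled:>7.1f}  "
          f"difference={pct:.0f}%  GEH={geh(modelled, count):.2f}")
```

The same percentage difference gives a GEH of about 6.9 on the mainline and about 0.7 on the ramp. A fixed GEH threshold therefore tolerates a larger percentage error on low volumes and a smaller one on high volumes.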
For traffic modelling work in the "baseline" scenario, a GEH of less than 5.0 is considered a good match between the modelled and observed hourly volumes. Flows of longer or shorter durations should be converted to hourly equivalents before these thresholds are applied. According to the DMRB, 85% of the volumes in a traffic model should have a GEH of less than 5.0. GEHs in the range of 5.0 to 10.0 may warrant investigation. If the GEH is greater than 10.0, there is a high probability that there is a problem with either the travel demand model or the data; this could be something as simple as a data entry error or as complicated as a serious model calibration problem.
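The thresholds above could be applied to a set of links as in the following sketch. The link volumes are invented for illustration, the function names are hypothetical, and the linear conversion to hourly equivalents in `to_hourly` is a simplifying assumption.

```python
import math


def geh(m, c):
    """GEH statistic for modelled volume m and counted volume c (both per hour)."""
    return math.sqrt(2.0 * (m - c) ** 2 / (m + c))


def to_hourly(volume, duration_hours):
    """Assumption: scale a count linearly to an hourly equivalent."""
    return volume / duration_hours


def classify(value):
    """Map a GEH value to the bands described in the text."""
    if value < 5.0:
        return "good match"
    if value <= 10.0:
        return "may warrant investigation"
    return "probable model or data problem"


def share_below(pairs, threshold=5.0):
    """Fraction of (modelled, counted) pairs whose GEH is below the threshold."""
    values = [geh(m, c) for m, c in pairs]
    return sum(v < threshold for v in values) / len(values)


# Hypothetical (modelled, counted) hourly volumes for five links.
links = [(1050, 1000), (480, 520), (95, 60), (2300, 2250), (30, 120)]

for m, c in links:
    print(f"M={m:>5}  C={c:>5}  GEH={geh(m, c):5.2f}  {classify(geh(m, c))}")

share = share_below(links)
print(f"{share:.0%} of links have GEH < 5.0; 85% criterion met: {share >= 0.85}")
```

In this invented data set four of the five links fall below 5.0, so the 85% criterion is not met, and the last link exceeds 10.0 and would be checked for a data or calibration error.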
The GEH formula is useful in situations such as the following:[4] [5] [6]
The GEH statistic depends on the magnitude of the values. Thus, the GEH statistics of two counts of different duration (e.g., daily vs. hourly values) cannot be directly compared. For the same reason, the GEH statistic is not suitable for evaluating other indicators, e.g., trip distance.[7]
For a given count $C$, deviations of the same absolute size upward and downward yield different GEH values, so the evaluation is not symmetrical.
Moreover, the GEH statistic is not dimensionless: for hourly volumes its unit is the square root of vehicles per hour ($\mathrm{s}^{-1/2}$ in SI base units).
The GEH statistic does not fall within a range of values between 0 (no match) and 1 (perfect match). Its values can therefore only be interpreted with sufficient experience, not intuitively.
Furthermore, the statistic is criticized for lacking a well-founded statistical derivation.
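Two of these points, the asymmetry and the dependence on count duration, can be illustrated numerically. The volumes below are hypothetical, and the daily values assume 24 identical hours purely for the sake of the example.

```python
import math


def geh(m, c):
    """GEH statistic for modelled volume m and counted volume c."""
    return math.sqrt(2.0 * (m - c) ** 2 / (m + c))


# Asymmetry: the same absolute deviation above and below a fixed count.
count = 1000
up = geh(count + 250, count)
down = geh(count - 250, count)
print(f"deviation +250: GEH={up:.2f}   deviation -250: GEH={down:.2f}")

# Duration dependence: the same relative error on hourly and daily totals.
hourly = geh(1100, 1000)
daily = geh(24 * 1100, 24 * 1000)  # assumption: 24 identical hours
print(f"hourly GEH={hourly:.2f}   daily GEH={daily:.2f}")
```

The downward deviation produces the larger GEH because the sum in the denominator is smaller. Multiplying both volumes by 24 multiplies the GEH by the square root of 24, which is why thresholds defined for hourly volumes do not carry over to daily totals.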
An alternative measure to the GEH statistic is the Scalable Quality Value (SQV), which addresses the problems mentioned above: it is applicable to various indicators, it is symmetric, it has no units, and its values range between 0 and 1. Moreover, Friedrich et al. derive the relationship between the GEH statistic and the normal distribution, and thus the relationship between the SQV statistic and the normal distribution. The SQV statistic is calculated using an empirical formula with a scaling factor $f$:
$$\mathrm{SQV} = \frac{1}{1 + \sqrt{\dfrac{(m-c)^2}{f \cdot c}}}$$
where $m$ is the modelled value and $c$ is the observed value.
By introducing the scaling factor $f$, the SQV statistic can be used to evaluate other mobility indicators. The scaling factor is based on the typical magnitude of the mobility indicator (taking into account the corresponding unit).
Indicator | Order of magnitude | Scaling factor $f$
---|---|---
Number of person trips per day (total, per mode, per purpose) | $10^0$ | 1
Mean trip distance in kilometers | $10^1$ | 10
Duration of all trips per person per day in minutes | $10^2$ | 100
Traffic volume per hour | $10^3$ | 1,000
Traffic volume per day | $10^4$ | 10,000
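The role of the scaling factor can be sketched as follows. The sketch assumes the SQV form SQV = 1 / (1 + sqrt((m - c)² / (f · c))), with m the modelled and c the observed value. The dictionary keys and the example values are hypothetical.

```python
import math

# Scaling factors taken from the table above; the key names are hypothetical labels.
SCALING_FACTORS = {
    "person_trips_per_day": 1,
    "mean_trip_distance_km": 10,
    "trip_duration_min_per_day": 100,
    "traffic_volume_per_hour": 1_000,
    "traffic_volume_per_day": 10_000,
}


def sqv(m, c, f):
    """Scalable Quality Value for modelled value m, observed value c, scaling factor f.

    Assumed form: 1 / (1 + sqrt((m - c)^2 / (f * c))).
    """
    if c <= 0 or f <= 0:
        raise ValueError("observed value and scaling factor must be positive")
    return 1.0 / (1.0 + math.sqrt((m - c) ** 2 / (f * c)))


# Hypothetical examples: a 10% overestimate for two different indicators.
examples = [
    ("mean_trip_distance_km", 11.0, 10.0),
    ("traffic_volume_per_hour", 1100.0, 1000.0),
]
for name, modelled, observed in examples:
    value = sqv(modelled, observed, SCALING_FACTORS[name])
    print(f"{name}: SQV={value:.3f}")
```

With the scaling factor matched to the indicator's typical magnitude, the same relative deviation yields the same SQV (about 0.909) for both indicators, which is what allows a single set of quality categories to be used across them.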
However, the SQV statistic should not be used for the following indicators:
Friedrich et al. recommend the following categories:
SQV statistic | GEH statistic (with f = 1,000 and c = 1,000) | Evaluation
---|---|---
0.90 | 3.4 to 3.6 | Very good match
0.85 | 5.4 to 5.8 | Good match
0.80 | 7.5 to 8.5 | Acceptable match

Since the GEH statistic is not symmetrical, the same absolute deviation of a measured value upward and downward is evaluated differently, which is why each SQV value corresponds to a range of GEH values.
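The correspondence between the two statistics can be checked with a short sketch. It assumes the SQV form SQV = 1 / (1 + sqrt((m - c)² / (f · c))) and inverts it to obtain the absolute deviation that belongs to a given SQV value. The function names are hypothetical.

```python
import math


def geh(m, c):
    """GEH statistic for modelled volume m and counted volume c."""
    return math.sqrt(2.0 * (m - c) ** 2 / (m + c))


def deviation_for_sqv(target, c, f):
    """Absolute deviation |m - c| at which the assumed SQV formula equals target."""
    return math.sqrt(f * c) * (1.0 / target - 1.0)


c, f = 1000, 1000
for target in (0.90, 0.85, 0.80):
    d = deviation_for_sqv(target, c, f)
    # The upward deviation gives the lower GEH, the downward deviation the higher one.
    print(f"SQV={target:.2f}  |m-c|={d:6.1f}  "
          f"GEH={geh(c + d, c):.2f} (upward) to {geh(c - d, c):.2f} (downward)")
```

Under these assumptions the computed GEH ranges agree with the table to within rounding, and the width of each range reflects the asymmetry of the GEH statistic.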
Surveys of mobility indicators or traffic volumes are often conducted under non-ideal conditions, e.g. with large standard deviations or small sample sizes. Friedrich et al. describe a procedure that takes both of these conditions into account in the calculation of the SQV statistic.