Now that I’ve outlined how “Truth Event” scoring is accomplished, how well did it do for the Tuscaloosa-Birmingham long-track supercell event from 27 April 2011? Recall that a “truth event” is defined as a continuous time period a specific grid point is under a warning(s) and/or a tornado observation(s) surrounded by at least one minute of neither. What are the various quantities that can be calculated for each truth event? Beginning with these values:
twarningBegins = time that the warning begins
twarningEnds = time that the warning ends
tobsBegins = time that the observation begins
tobsEnds = time that the observation ends
Given these various times, the following time measures can be calculated for each truth event:
LEAD TIME (lt): tobsBegins – twarningBegins [HIT events only]
DEPARTURE TIME (dt): twarningEnds – tobsEnds [HIT events only]
FALSE TIME (ft): twarningEnds – twarningBegins [FALSE events only]
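The three time measures can be sketched in a few lines of Python. This is only an illustration, not the code used for the scoring here: the `TruthEvent` class, the `classify` and `time_measures` helpers, and the convention of using `None` for "no warning" or "no observation" are all hypothetical, and times are assumed to be in minutes from an arbitrary reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TruthEvent:
    # Hypothetical container: times in minutes; None means the event
    # had no warning (or no tornado observation) at this grid point.
    warning_begins: Optional[float]
    warning_ends: Optional[float]
    obs_begins: Optional[float]
    obs_ends: Optional[float]


def classify(ev: TruthEvent) -> str:
    """Assumed rule: HIT = warned and observed, FALSE = warned only,
    MISS = observed only, otherwise CORRECT NULL."""
    warned = ev.warning_begins is not None
    observed = ev.obs_begins is not None
    if warned and observed:
        return "HIT"
    if warned:
        return "FALSE"
    if observed:
        return "MISS"
    return "CORRECT NULL"


def time_measures(ev: TruthEvent) -> dict:
    """Return lt and dt for HIT events, ft for FALSE events, else nothing."""
    kind = classify(ev)
    if kind == "HIT":
        return {
            "lt": ev.obs_begins - ev.warning_begins,  # lead time
            "dt": ev.warning_ends - ev.obs_ends,      # departure time
        }
    if kind == "FALSE":
        return {"ft": ev.warning_ends - ev.warning_begins}  # false time
    return {}


# A warning from minute 10 to 55 with a tornado from minute 30 to 40.
print(time_measures(TruthEvent(10, 55, 30, 40)))
# A warning that starts after the tornado arrives gives a negative lead time.
print(time_measures(TruthEvent(35, 60, 30, 50)))
```

Note that nothing in the lead time formula prevents a negative value: a grid point that is warned only after the tornado has arrived still counts as a HIT event under this rule.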
Then we can take all the truth events and add up the number of HIT EVENTS, FALSE EVENTS, MISS EVENTS, and CORRECT NULL EVENTS, and put them into our 2×2 contingency table. As before, I am using a 5 km radius of influence (“splat”) around the one-minute tornado observations, not applying the Cressman distance weighting to the truth splats, and I’m using the composite reflectivity filtering for the correct nulls:
First note that the raw numbers in this 2×2 table are much smaller than those from the “Grid Point” method of scoring. Recall that the first scoring method counted each grid point as a single data point at each one minute time step. For the “Truth Event” scoring, multiple grid values at multiple time steps are combined into a single truth event. Also note that there are no truth events categorized as a MISS EVENT. That means that every grid point within a 5 km radius of the two tornado paths was covered by a warning at some point during the event. Remember that there was a 2-minute gap in the warnings when the tornado was southwest of Tuscaloosa. However, since those grid points were eventually warned, they were considered HIT EVENTs, but their lead time ends up being negative because the warning began after the tornado observation did.
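To make the "combining" step concrete, here is a minimal sketch of how the one-minute values at a single grid point could be grouped into truth events, following the definition recalled at the top of this post. The function name `truth_events` and the use of two per-minute boolean lists are assumptions for illustration, not the actual processing code.

```python
def truth_events(warned, observed):
    """Group one grid point's per-minute flags into truth events.

    warned[t] / observed[t] say whether minute t was under a warning /
    within a tornado observation splat. A truth event is a maximal run
    of minutes where either flag is set, so any minute with neither
    flag separates two events. Returns (first_minute, last_minute) pairs.
    """
    events = []
    start = None
    for t, (w, o) in enumerate(zip(warned, observed)):
        active = w or o
        if active and start is None:
            start = t                      # a new truth event begins
        elif not active and start is not None:
            events.append((start, t - 1))  # event ended the minute before
            start = None
    if start is not None:                  # event still open at end of data
        events.append((start, len(warned) - 1))
    return events


# Warning at minutes 2-3 and 5-6, tornado at minutes 3-4: the observation
# bridges the gap in the warnings, so this is a single truth event.
warned = [False, False, True, True, False, True, True, False]
observed = [False, False, False, True, True, False, False, False]
print(truth_events(warned, observed))
```

With this grouping, many grid-point-minutes collapse into one event, which is why the counts in the table are so much smaller than in the "Grid Point" method.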
Here are the various numbers for the truth events:
POD = 1.0000
FAR = 0.8029
CSI = 0.1971
HSS = 0.2933
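For readers who want to reproduce scores like these from their own 2×2 table, the sketch below assumes the standard definitions of POD, FAR, CSI, and the Heidke Skill Score. The function name `contingency_scores` is hypothetical, and the counts in the example are made up for illustration; they are not the counts from this event.

```python
def contingency_scores(hits, false_alarms, misses, correct_nulls):
    """Standard 2x2 contingency table scores (assumed definitions).

    No guard against empty rows or columns: a table with no hits and
    no misses, for example, would divide by zero.
    """
    a, b, c, d = (float(x) for x in (hits, false_alarms, misses, correct_nulls))
    pod = a / (a + c)        # probability of detection
    far = b / (a + b)        # false alarm ratio
    csi = a / (a + b + c)    # critical success index
    # Heidke Skill Score: skill relative to random chance
    hss = 2.0 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))
    return {"POD": pod, "FAR": far, "CSI": csi, "HSS": hss}


# Illustrative counts only (not from the 27 April 2011 event).
scores = contingency_scores(hits=20, false_alarms=80, misses=0, correct_nulls=900)
for name, value in scores.items():
    print(f"{name} = {value:.4f}")
```

One thing the formulas make clear: when there are no MISS EVENTs, CSI reduces to 1 − FAR, which is exactly the relationship between the two values listed above (0.1971 = 1 − 0.8029).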
When comparing these to the grid point style of scoring, there seems to be improvement in all areas. But note that these are based on the fact that each truth event is considered equal, no matter how long that event was. A one-minute false alarm and a 60-minute false alarm are each counted as one false alarm. Sounds like one of our original traditional warning verification pitfalls. But we have information about the time history of each truth event, and can extract even more information out of them. This is where the time measures come in. Computing the average of the various time measures for all grid points:
Average Lead Time (lt) = 22.9 minutes
Average Departure Time (dt) = 15.2 minutes
Average False Time (ft) = 39.8 minutes
Now we can get a more complete picture of how well the warnings did for all of the specific geographic locations. From the NWS Warning verification database, the average lead time for all the one-minute segments for both tornadoes with this storm is 22.1 minutes. This is very close to the 22.9 minutes above, because our gridded verification data also has one-minute intervals, but we are also counting grid points within 5 km of the tornado at each minute, which increases the number of data points by about 80x. Also remember that the ground truth I used was more spatially precise than a straight line connecting the start and end positions of the tornado, and more temporally precise in that the one-minute locations are not based on an equal division of the straight path between end points.
Regarding DEPARTURE TIME, this is a new metric that can be calculated. In this case, each grid point affected by the 5 km tornado “splat” remains, on average, under a Tornado Warning for an extra 15.2 minutes even though the tornado threat has already cleared.
And with FALSE TIME, we can now extract information back out of the truth event numbers to tell us that our warnings may be too large or too long. In this case, of the grid points warned but not affected by the tornado, on average, these grid points were “over-warned” by 39.8 minutes. And to get a representation of the approximate False Alarm Area, 10,304 square kilometers of ground were falsely warned for at least one minute.
In the next blog post, we will dissect the truth event statistics a little more, looking at various numerical and geospatial distributions.
Greg Stumpf, CIMMS and NWS/MDL