Sneak Peek from the past

So after the Weather Ready Nation: A Vital Conversation workshop, I finally have some code and visualization software working. Here is a sneak peek, using the software Mondrian and an object identification algorithm that I wrote in Fortran and applied via NCL. Storm objects were defined using a double-threshold, double-area technique: you set a minimum composite reflectivity threshold, then use a second, higher threshold to ensure the object is a true storm. The area thresholds apply to the reflectivity thresholds so that you restrict storm sizes (essentially a filter to reduce noise from very small storms).
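Since the Fortran/NCL code itself is not shown here, a minimal Python sketch of the double-threshold, double-area idea might look like the following; the reflectivity and area values are placeholders, not the thresholds used for this dataset.

```python
import numpy as np
from scipy import ndimage

def identify_storms(refl, low_dbz=35.0, high_dbz=50.0,
                    min_area_low=20, min_area_high=4):
    """Double-threshold, double-area storm identification (illustrative only).

    refl         : 2-D composite reflectivity field (dBZ)
    low_dbz      : minimum reflectivity defining an object's footprint
    high_dbz     : second threshold a "true storm" must also exceed
    min_area_*   : minimum pixel counts at each threshold (noise filter)
    """
    labels, nlab = ndimage.label(refl >= low_dbz)   # contiguous low-threshold regions
    storms = np.zeros_like(labels)
    storm_id = 0
    for lab in range(1, nlab + 1):
        region = labels == lab
        if region.sum() < min_area_low:                          # too small overall
            continue
        if (region & (refl >= high_dbz)).sum() < min_area_high:  # core too small/weak
            continue
        storm_id += 1
        storms[region] = storm_id
    return storms, storm_id
```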

We have a few ensemble members from 27 April, generated by CAPS, which I was intent on mining. The volume of data is large, but the number of variables was restricted to a few environmental and storm-centric perspectives. I added in the storm report data from SPC (soon I will have the observed storms).

[Image: Slide1 — Mondrian linked views of the storm-object dataset, described below]

In the upper left is a bar chart of my cryptic encoding of observed storm reports; below that is the histogram of hourly maximum surface wind speed, and below that the integrated hail mixing ratio parameter. The two scatter plots in the middle show (top) the CAPE-0-6 km shear product (CASH) versus the hourly maximum updraft helicity, obtained from a similar object algorithm whose objects intersect the storm, and (bottom) the 0-1 km storm relative helicity versus the LCL height. The plots on the right show (top) the histogram of model forecast hour, (bottom) the sorted ensemble member spinogram*, and (bottom inset) the log of the pixel count of the storms.

The red-highlighted storms have a CASH value greater than 30,000 and object-based updraft helicity (UHobj) greater than 50. We can see interactively, on all the plots, where these storms fall in each distribution. The highlighted storms represent 24.04 percent of the sample of 2271 storms identified from the 17 ensemble members over the 23-hour period from 1400 UTC to 1200 UTC.
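For concreteness, the interactive selection amounts to a simple conjunction of conditions on the storm attribute table. A hypothetical offline version, with an invented file name and column names, might look like:

```python
import pandas as pd

# Hypothetical export of the storm-object attribute table (file and column
# names are invented for illustration).
storms = pd.read_csv("storm_objects.csv")   # assumed columns: cash, uh_obj, member, fhour, ...

highlight = (storms["cash"] > 30_000) & (storms["uh_obj"] > 50)
print(f"{100 * highlight.mean():.2f} percent of {len(storms)} storms highlighted")

# How much each ensemble member contributes to the highlighted parameter space
print(storms.loc[highlight, "member"].value_counts())
```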

Although the contributions from each member are nearly equivalent (not shown; this cannot easily be gleaned from the spinogram), some members contribute more of their storms to this parameter space (sorted from highest to lowest in the member spinogram). The peak time for storms in this environment was 2100 UTC, with the three highest hours being 2000-2200 UTC. Only about half of the modeled storms had observed storm reports within 45 km**. This storm environment contained the majority of high hail values, though the hail distribution hints at being bimodal. Many of these storms had very low LCL heights (below 500 m), and most were below 1500 m.

I anticipate using these tools and software for the upcoming HWT. We will be able to do next-day verification using storm reports (assuming the WFOs update storm reports in a timely manner), and I hope to also do a strict comparison to observed storms. I still have work to do in order to approach distributions-oriented verification.

*The spinogram in this case is a bar chart in which the length of each bar is rescaled to 100 percent and the width of the bar represents the sample size. The red highlighting then represents the within-category percentage.
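A minimal matplotlib sketch of this construction, assuming a list of per-category counts and highlighted counts (the example numbers and member labels are invented; Mondrian does all of this interactively):

```python
import numpy as np
import matplotlib.pyplot as plt

def spinogram(category_counts, highlighted_counts, labels=None):
    """Spinogram: bar width ~ category sample size, bar height rescaled to
    100 percent, with the highlighted (within-category) fraction shaded."""
    counts = np.asarray(category_counts, dtype=float)
    hits = np.asarray(highlighted_counts, dtype=float)
    widths = counts / counts.sum()                   # widths proportional to sample size
    lefts = np.concatenate(([0.0], np.cumsum(widths)[:-1]))
    frac = np.where(counts > 0, hits / counts, 0.0)  # within-category percentage

    fig, ax = plt.subplots()
    ax.bar(lefts, np.ones_like(widths), width=widths, align="edge",
           color="lightgray", edgecolor="k")         # full bar = 100 percent
    ax.bar(lefts, frac, width=widths, align="edge",
           color="red", edgecolor="k")               # highlighted share
    if labels is not None:
        ax.set_xticks(lefts + widths / 2)
        ax.set_xticklabels(labels, rotation=90)
    ax.set_ylabel("Within-category fraction")
    return fig, ax

# Invented example: storm counts per member and the highlighted subset
fig, ax = spinogram([300, 250, 120], [120, 60, 10], ["m01", "m02", "m03"])
plt.show()
```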

**I also had to use a +/- 1 hour time window. An initial attempt to verify the tornado reports against the tornado tracks revealed a bit of spatial error, which will need to be quantified.
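A sketch of the report-matching step, assuming storms and reports have already been projected onto a common map in kilometers; the 45 km and +/- 1 hour values come from the text, everything else is illustrative:

```python
import numpy as np

def match_reports(storm_xy, storm_hr, report_xy, report_hr,
                  max_dist_km=45.0, max_dt_hr=1.0):
    """Flag modeled storms with at least one observed report nearby in space-time.

    storm_xy, report_xy : (N, 2) arrays of coordinates on a common map (km)
    storm_hr, report_hr : 1-D arrays of valid times (hours)
    """
    storm_xy, report_xy = np.asarray(storm_xy, float), np.asarray(report_xy, float)
    storm_hr, report_hr = np.asarray(storm_hr, float), np.asarray(report_hr, float)
    matched = np.zeros(len(storm_xy), dtype=bool)
    for i in range(len(storm_xy)):
        near_t = np.abs(report_hr - storm_hr[i]) <= max_dt_hr   # +/- 1 hour window
        if not near_t.any():
            continue
        d = np.hypot(report_xy[near_t, 0] - storm_xy[i, 0],
                     report_xy[near_t, 1] - storm_xy[i, 1])
        matched[i] = (d <= max_dist_km).any()                   # within 45 km
    return matched
```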

More Data Visualization

As jimmyc touched on in his last post, one of the struggles facing the Hazardous Weather Testbed is how to visualize the incredibly large datasets being generated. With well over 60 model runs available to HWT Experimental Forecast Program participants, the ability to synthesize large volumes of data very quickly is a must. Historically we have utilized a meteorological visualization package known as NAWIPS, the same software that the Storm Prediction Center uses in its operations. Unfortunately, NAWIPS was not designed to handle datasets as large as those currently being generated.

To help mitigate this, we utilized the Internet as much as possible. One webpage that I put together was a highly dynamic CI forecast and observations page. It allowed users to create 3-, 4-, 6-, or 9-panel plots with CI probabilities from any of the 28 ensemble members, the NSSL-WRF, or observations. Furthermore, users could overlay the raw CI points from any of the ensemble members, the NSSL-WRF, or observations to see how the points contributed to the underlying probabilities. We even enabled users to overlay the human forecasts to see how they compared to any of the numerical guidance or observations. This webpage turned out to be a huge hit with visitors, not only because it allowed for quick visualization of a large amount of data, but also because it allowed visitors to interrogate the ensemble from anywhere, not just in the HWT.
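The probabilities themselves were generated upstream of the webpage, but to illustrate how raw CI points relate to an underlying probability field, here is one common neighborhood-style construction; the neighborhood radius and smoothing length below are assumptions, not the values used in the experiment:

```python
import numpy as np
from scipy.ndimage import maximum_filter, gaussian_filter

def ensemble_ci_probability(member_points, grid_shape, radius_px=10, sigma_px=0):
    """Fraction of members with a CI point near each grid point.

    member_points : one list of (row, col) CI grid indices per ensemble member
    grid_shape    : (ny, nx) of the model grid
    radius_px     : half-width of the (square) neighborhood, in grid points
    sigma_px      : optional Gaussian smoothing length, in grid points
    """
    prob = np.zeros(grid_shape)
    for points in member_points:
        field = np.zeros(grid_shape)
        for i, j in points:
            field[i, j] = 1.0
        # Neighborhood step: any CI within the radius counts at this grid point
        prob += maximum_filter(field, size=2 * radius_px + 1)
    prob /= max(len(member_points), 1)
    if sigma_px:
        prob = gaussian_filter(prob, sigma=sigma_px)
    return prob
```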

One of the things we could do with this website was evaluate the performance of individual members of the ensemble. We could also evaluate how varying the PBL schemes affected the probabilities of CI. Again, the website was a great way to sift through a large amount of data in a relatively short amount of time.

Visualization

We all built some web displays for various components of the CI desk. I built a few based on object identification of precipitation areas. I counted up the objects per hour for all ensemble members or all physics members (separate web pages) in order to 1) rapidly visualize the entire membership and 2) add a non-map-based perspective on when interesting things are happening. It also conveys the full variability in time and the variability in position and size of the objects.
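The counting step is simple in principle; a sketch of the member-by-hour count matrix, with placeholder precipitation and size thresholds, could look like:

```python
import numpy as np
from scipy import ndimage

def object_counts(precip, threshold=5.0, min_pixels=4):
    """Count precipitation objects per member and per forecast hour.

    precip     : array of shape (n_members, n_hours, ny, nx), e.g. hourly rain (mm)
    threshold  : value defining an object (placeholder, not the experiment's value)
    min_pixels : smallest object kept (placeholder noise filter)
    """
    n_members, n_hours = precip.shape[:2]
    counts = np.zeros((n_members, n_hours), dtype=int)
    for m in range(n_members):
        for h in range(n_hours):
            labels, _ = ndimage.label(precip[m, h] >= threshold)
            sizes = np.bincount(labels.ravel())[1:]   # pixel count per object
            counts[m, h] = int((sizes >= min_pixels).sum())
    return counts   # rows: members, columns: forecast hours
```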

The goal was to examine the models in multiple ways simultaneously and still investigate the individual members. This, in theory, should be more satisfying for forecasters as they get more comfortable with ensemble probabilities. It could alleviate data overload by giving a focused look at select variables within the ensemble: variables that already carry meaning and implied depth, and information that is easy to extract and reference.

The basic idea, as implemented, was to show the object-count chart; mousing over a grid cell calls up a map of the area with the precipitation field. Mousing over the upper or right-most axes calls up an animation of either all the members at a specific time or one member at all times. The same concept was applied to updraft helicity.

I applied the same idea to the convection initiation points, only this time there were no objects, just the raw number of points. I had not had time to visualize this prior to the experiment, so we used it as a way to compare two of the CI definitions in test mode.

The ideas were great, but in the end there were a few issues. The graphics worked well in some instances because we started with no precipitation, updraft helicity, or CI points. But if the region already had storms, then interpretation was difficult, at least in terms of the object counts. This was a big issue with the CI points, especially as the counts climbed well above 400 for a 400 by 400 km subdomain.

Another display I worked hard on was the so-called pdf generator. The idea was to use the ensemble to reproduce what we were doing, namely putting our CI point on the map where we thought the first storm would be. Great in principle, but automating this was problematic because we could choose our time window to fit the situation of the day. The other complication was that sometimes we had to make our domain small or large, depending on how much pre-existing convection was around. This happened quite frequently, so the graphic was less applicable, but still very appealing. It will take some refinement, but I think we can make this part of the verification of our human forecasts.

I found this type of web display to be very useful and very quick. It also allows us to shift our perspective from data mining to information mining, and consequently to think more about visualization of the forecast data. There is much work to be done in this regard, and I hope some of these ideas can be built upon further so that visualization and information mining become more relevant to forecasters.

Certainty, doubt, and verification

Today's CI forecast focused on the area from northeast Kansas southwestward along a front toward the TX-OK panhandles. It was straightforward enough: How far southwest will the cap break? Will there be enough moisture in the warm sector near the frontal convergence? Will the dryline serve as a focus for CI, given the dry slot developing just ahead of the dryline along the southern extent of the front and a transition zone (a zone of reduced moisture)?

So we went to work mining the members of the ensemble, scrutinizing the deterministic models for surface moisture evolution, examining the convergence fields, and looking at ensemble soundings. The conclusion from the morning was two moderate risk areas: one in northeast KS and another covering the triple point, dryline, and cold front. The afternoon forecast backed off the dryline-triple point given the observed dry slot and the dry sounding from LMN at 1800 UTC.

The other issue was that the dryline area was so dry, and the PBL so deep, that convective temperature would be reached but with minimal CAPE (10-50 J/kg). The dry LMN sounding was assumed to be representative of the larger mesoscale environment. This was wrong: the 00 UTC sounding at LMN indicated an increase in moisture of 6 g/kg aloft and 3 g/kg at the surface.

Another aspect of this case was our scrutiny of the boundary layer and the presence of open-cell convection and horizontal convective rolls (HCRs). We discussed, again, that at 4 km grid spacing we are close to resolving these types of features. We are close because the scale of the rolls (to be resolved, they need to be larger than about 7 times the grid spacing) scales with the boundary layer depth, so on a day like today, with a deep PBL, the rolls should be close to resolvable. On the other hand, additional diffusion is needed in light-wind conditions, and when it is absent the scale of the rolls collapses toward the scale of the grid. In order to believe the model we must take these considerations into account. In order to discount the model, we are unsure what to look for besides indications of "noise" (e.g. features barely resolved on the grid, roll scales close to 5 times the grid spacing).
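As a back-of-the-envelope check of this argument, one can compare an assumed roll wavelength (a few times the PBL depth; the aspect ratio below is an assumption, and the answer is quite sensitive to it) against the resolvability criterion of several grid lengths:

```python
def rolls_resolvable(pbl_depth_m, dx_m=4000.0, aspect_ratio=3.0, min_pts=7):
    """Rough check of whether horizontal convective rolls are resolvable.

    pbl_depth_m  : boundary layer depth (m)
    dx_m         : model grid spacing (m); 4 km in this case
    aspect_ratio : assumed roll wavelength / PBL depth (an assumed typical value)
    min_pts      : grid lengths per wavelength needed to resolve a feature
    """
    wavelength_m = aspect_ratio * pbl_depth_m
    return wavelength_m, wavelength_m >= min_pts * dx_m

# With a 2.5 km deep PBL, assumed wavelengths range from ~7.5 km (ratio 3)
# to ~15 km (ratio 6), compared to a 28 km (7 * dx) resolvability criterion:
for ratio in (3.0, 6.0):
    print(rolls_resolvable(2500.0, aspect_ratio=ratio))
```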

The HCRs were present today as per this image from Wichita:

[Image: ict_110608_roll — Wichita (ICT) radar image showing horizontal convective rolls]

However, just because HCRs were present does not mean I can prove they were instrumental in CI. So when we saw the forecast for HCRs along the front today, and storms subsequently developed, we had some potential evidence. Given the distance from the radar, it may be difficult, if not impossible, to prove that HCRs intersected the front and contributed to CI.

This brings up another major point: in order to really know what happened today, we need a lot of observational data. Major field project data. Not just surface data, but soundings, profilers, and low-level radar data. On the scale of The Thunderstorm Project, only for numerical weather prediction. How else can we say with any certainty that the features we were using to make our forecast were present and contributing to CI? This is the scope of data collection we would require for months in order to get a sufficient number of cases to verify the models (both state variables and processes such as HCRs). Truly an expensive undertaking, yet one where many people could benefit from one data set and the field of NWP could improve tremendously. And let's not forget about forecasters, who could benefit from having better models, better understanding, and better tools to help them.

I will update the blog after we verify this case tomorrow morning.

Wrong but verifiable

The fine-resolution guidance we are analyzing can get the forecast wrong yet probabilistically verify. It may seem strange, but the models do not have to be perfect; they just have to be smooth enough (tuned, bias corrected) to be reliable. The smoothing is done on purpose to account for the fact that the discretized equations cannot resolve features smaller than 5-7 times the grid spacing. It is also done because the models have little skill below 10-14 times the grid spacing. As has been explained to me, this is approximately the scale at which the forecasts become statistically reliable. A reliable 10 percent probability forecast, for example, will verify 10 percent of the time.
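To make the reliability statement concrete, here is a minimal sketch of the calculation behind a reliability (attributes) diagram; for a reliable system the observed frequency in each probability bin matches the bin's forecast probability, e.g. the 10 percent forecasts verify about 10 percent of the time:

```python
import numpy as np

def reliability_curve(forecast_probs, observed, bins=np.linspace(0, 1, 11)):
    """Compare forecast probabilities with observed event frequencies.

    forecast_probs : 1-D array of forecast probabilities (0-1)
    observed       : 1-D array of 0/1 outcomes
    Returns bin centers, observed frequency per bin, and per-bin sample counts.
    """
    forecast_probs = np.asarray(forecast_probs, float)
    observed = np.asarray(observed, float)
    idx = np.clip(np.digitize(forecast_probs, bins) - 1, 0, len(bins) - 2)
    centers = 0.5 * (bins[:-1] + bins[1:])
    freq = np.full(len(centers), np.nan)
    counts = np.zeros(len(centers), dtype=int)
    for b in range(len(centers)):
        mask = idx == b
        counts[b] = mask.sum()
        if counts[b]:
            freq[b] = observed[mask].mean()   # observed frequency in this bin
    return centers, freq, counts
```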

This makes competing with the model tough unless we have skill not only at deriving similar probabilities, but at placing those probabilities in close proximity to the observations in space and time. Re-wording this statement: draw the radar at forecast hour X probabilistically. If you draw those probabilities to cover a large area, you won't necessarily verify. But if you know the number of storms, their intensity, and their longevity, and you place them close to what was observed, you can verify as well as the models. Which means humans can be just as wrong yet still verify their forecasts well.

Let us think through drawing the radar. This is exactly what we are trying to do, in a limited sense, in the HWT for the Convection Initiation and Severe Storms desks over 3-hour periods. The trick is the 3-hour period, over which the models and forecasters can effectively smooth their forecasts. We isolate the areas of interest and try to use the best forecast guidance to come up with a mental model of what is possible and probable. We try to add detail to that area by increasing the probabilities in some places and removing them in others. But we still feel we are ignoring certain details. In CI, we feel like we should be trying to capture episodes. An episode is where CI occurs in close proximity to other CI within a certain time frame, presumably because of a similar physical mechanism.
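One simple way to operationalize "episodes" is to cluster CI points that are close in both space and time; the sketch below uses single-linkage clustering with assumed space and time scales (the 50 km and 1 hour values are placeholders, not choices made in the experiment):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def ci_episodes(x_km, y_km, t_hr, space_scale_km=50.0, time_scale_hr=1.0):
    """Group CI points into episodes: points close in space and time share a label.

    x_km, y_km : CI point coordinates on a common map (km)
    t_hr       : CI times (hours)
    The space and time scales are placeholders, not tuned values.
    """
    # Non-dimensionalize so one "unit" equals one space or time scale
    pts = np.column_stack([np.asarray(x_km) / space_scale_km,
                           np.asarray(y_km) / space_scale_km,
                           np.asarray(t_hr) / time_scale_hr])
    # Single-linkage clustering chains together points within ~1 combined unit
    Z = linkage(pts, method="single")
    return fcluster(Z, t=1.0, criterion="distance")

# Invented example: the first two points are close in space-time (one episode),
# the third is far away (a second episode)
print(ci_episodes([0, 20, 300], [0, 10, 250], [20.0, 20.5, 23.0]))
```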

By doing this we are essentially trying to provide context and perspective, but also a sense of understanding and anticipation. By knowing the mechanism, we hope to look for that mechanism, or symptoms of it, in observations, in hopes of anticipating CI. We also hope to be able to identify failure modes.

In speaking with forecasters over the last few weeks, I sense a general feeling that it is very difficult to know when to accept and when to reject the model guidance. The models don't have to be perfect in individual fields (correct values or low RMS error) but rather just need to be relatively correct (errors can cancel). How can we realistically predict model success or model failure? Can we predict when forecasters will get this assessment wrong?

Adding Value

Today the CI (convective initiation) forecast team opted to forecast for northern Nebraska, much of South Dakota, southern and eastern North Dakota, and far west-central Minnesota for the 3-hr window of 21-00 UTC. The general setup was one with an anomalously deep trough ejecting northeast over the intermountain West. Low-level moisture was not particularly deep, as a strong blocking ridge had persisted over the southern and eastern United States for much of the past week. With that said, the strength of the ascent associated with the ejecting trough, the presence of a deepening surface low, and a strengthening surface front were such that most numerical models insisted precipitation would break out across the CI forecast domain. The $64,000 question was, "Where?"

One model in particular, the High-Resolution Rapid Refresh (HRRR), insisted that robust storm development would occur across central and northeastern South Dakota during the late afternoon hours. It just so happened that this CI episode fell outside the CI team's forecast of a "Moderate Risk" of convective initiation. As the CI forecast team pored over more forecast information than any single individual could possibly retain, we could not make sense of how or why the HRRR was producing precipitation where it was. The environment would (should?) be characterized by decreasing low-level convergence as the low-level wind fields responded to the strengthening surface low to the west. Furthermore, the surface front (and other boundaries) were well removed from the area. Still, several runs of the HRRR insisted storms would develop there.

It's situations like this where humans can still improve upon storm-scale numerical models. By monitoring observations, and using the most powerful computers in existence (our brains), humans can add value to numerical forecasts. Knowing when to go against a model, or when it is important to worry about the nitty-gritty details of a forecast, are traits that good forecasters have to have. Numerical forecasts are rapidly approaching the point where, on a day-to-day basis, humans are hard pressed to beat them. And, in my opinion, forecasters should not be spending much time trying to determine whether the models are wrong by 1 degree Fahrenheit for afternoon high temperatures in the middle of summer in Oklahoma. Even if the human is correct and improves the forecast, was there much value added? Contrast this with a forecaster improving the forecast by 1 F when dealing with temperatures around 31-33 F and a precipitation forecast. In that case the human can add a lot of value. Great forecasters know when to accept numerical guidance, and when there is an opportunity to improve upon it (and then actually improve it). Today, that's just what the CI forecast team did. The HRRR was wrong in its depiction of thunderstorms developing in northeast South Dakota by 00 UTC (7 PM CDT), and the humans were right…

…and as I write this post at 9:30 PM CDT, a lone supercell moves slowly eastward across northeastern South Dakota. Maybe the HRRR wasn’t as wrong as I thought…


Timing

It is remarkably difficult to predict convection initiation. It appears we can predict, most of the time (see yesterday's post for a failure), the area under consideration. We have attempted to pick the time period, in 3-hour windows, and have been met with some interesting successes and failures. Today had two such examples.

We predicted a time window of 16-19 UTC across the North Carolina/South Carolina/Tennessee area for terrain-induced convection and for convection along the sea-breeze front. The terrain-induced storms went up around 18 UTC, nearly 2 hours after the model was generating storms. The sea breeze did not initiate storms, though farther inland, in central South Carolina, there was one lone storm.

The other area was in South Dakota/North Dakota/Nebraska for storms along the cold front and dryline. We picked a window of 21-00 UTC. It appears storms initiated right around 00 UTC in South Dakota, but there was little activity in North Dakota as the dryline surged into our risk area. Again, the suite of models had suggested quick initiation starting in the 21-22 UTC time frame, including the update models.

In both cases we could isolate the areas reasonably well. We even understood the mechanisms by which convection would initiate, including the dryline, the transition zone, and where the edge of the deeper moisture resided in the Dakotas. For the Carolinas we knew the terrain would be a favored location for elevated heating in the moist air mass along a weak, old frontal zone. We knew the sea-breeze convergence could be weak, and we knew that only a few storms would potentially develop. What we could not adequately do was predict the timing of the lid removal associated with the forcing mechanisms.

It is often observed in soundings that the lid is removed via surface heating and moistening, via cooling aloft, or both. It is also reasonable to suspect that low-level lifting could be aiding the cooling aloft (as opposed to cold advection). Without observations along such boundaries, it is difficult to know exactly what is happening along them, or even to infer that our models correctly depict the process by which the lid is overcome. We have been looking at the ensemble of physics members, which vary the boundary layer scheme, but today was the first day we attempted to use them in the forecast process.

Incorporating them was successful, but the understanding will have to come later. It is clear that understanding the various structures we see, and relating them to the times of storm initiation, will be a worthwhile effort. Whether this will be helpful to forecasting, even in hindsight, is still unknown.