HWT EFP – Page 12 – The Experimental Forecast Program

I know it when I see it and other discussions

Posted on May 11, 2011 by James Correia Jr.

We had many discussions over the last two days. One was regarding CI definitions. Of the variety of opinions we heard, a storm was defined by:
1. Whether lightning occurred,
2. a coherent, continuous thunderstorm that eventually reached a significant low level reflectivity threshold (40-45 dBZ) within 30 minutes,
3. any combination of 1 or 2, which also produced severe weather (i.e. it was not just a storm, but a severe storm).
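
To make the differences between these three definitions concrete, here is a minimal sketch that treats each one as a yes/no test on a tracked cell. The `StormCandidate` record, its field names, and the 40 dBZ / 30 minute defaults are assumptions made for illustration; they are not the algorithms used in the experiment.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StormCandidate:
    """Hypothetical record for one tracked cell (all field names are assumptions)."""
    lightning: bool                            # any lightning observed
    low_level_dbz: List[Tuple[float, float]]   # (minutes since first echo, dBZ)
    severe_report: bool                        # produced severe weather


def meets_lightning(c: StormCandidate) -> bool:
    # Definition 1: lightning occurred.
    return c.lightning


def meets_reflectivity(c: StormCandidate,
                       threshold_dbz: float = 40.0,
                       window_min: float = 30.0) -> bool:
    # Definition 2: the cell reached the reflectivity threshold within the window.
    return any(t <= window_min and dbz >= threshold_dbz
               for t, dbz in c.low_level_dbz)


def meets_severe(c: StormCandidate) -> bool:
    # Definition 3: definition 1 or 2, and the cell also produced severe weather.
    return (meets_lightning(c) or meets_reflectivity(c)) and c.severe_report


# A cell with lightning that only reaches 45 dBZ after 40 minutes.
cell = StormCandidate(lightning=True,
                      low_level_dbz=[(0, 20), (15, 32), (40, 45)],
                      severe_report=False)
results = (meets_lightning(cell), meets_reflectivity(cell), meets_severe(cell))
```

The same cell counts as a storm under the first definition but not under the second or third, which is exactly the kind of disagreement that makes verification definition dependent.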

These variations on the theme are exactly what we were considering for the experiment, the experimental algorithms from the model, and the forecast verification we had played with prior to the experiment.

Another conversation involved what forecasters would use from the experimental suite of variables. The variables would need to be robust, easy to interpret (i.e. quick to read and understand), and clear. This is a tough sell from the research side of things, but it is totally understandable from a forecaster perspective. Forecasters have limited time, in an environment of data overload, in which to extract (or mine) information from the various models. They have very specific goals too, from nowcasting (e.g. 1-2 hours; especially on days like today where models miss a significant component of what is currently happening), to forecasting (6-24 hours), to long lead forecasting (1-8 days).

We also spent some time discussing the Tuesday short wave trough in terms of satellite data and radiosonde data. I argued that many people believe that satellite data and their assimilation are much more important now (in data volume, coverage, and quality control) than radiosondes. It was mentioned that radiosondes are very important on the mesoscale, especially in the 0-24 hour and possibly 48 hour forecasts. Still more opinions were expressed that some forecasters have questioned the need for twice a day soundings. Opinions in the HWT ranged from soundings are important, to soundings should be launched more often, to soundings should be launched more often at different times. It is plausible that some of our NWP difficulty may be due to launching soundings at transition times of the boundary layer.

I am of the opinion that if model suites are launched 4 times per day, then soundings should be launched at least 4 times per day, especially now that cycling data assimilation is common practice. This would return our field to the 1950s era, when soundings were launched 4 times per day at 3, 9, 15, and 21 UTC.

Lastly, we discussed what happens when a portion of the forecast domain is totally out to lunch, like today, where the NM convection was not represented. I think I will talk about that tomorrow once we verify our forecast in the OK area from today. Stay tuned.

It’s complicated

Posted on May 11, 2011 by James Correia Jr.

As expected, it was quite a challenge to pick domains for days 2 and 3. Day 2 was characterized by 3 potential areas of CI: Ohio to South Carolina, Minnesota and Iowa, and Texas. We were trying to determine how to deal with pre-existing convection: whether it was in our domain already or would be in our domain during our assumed CI time. As a result, we determined that the Ohio to South Carolina domain was not going to be as clean-slate as Texas or Minnesota. So we voted out SC.

We were left with Texas (presumed dryline CI) and Minnesota (presumed warm front/occlusion zone). Texas was voted in first but we ended up making the MN forecast in the afternoon. Data for this day did not flow freely, so we used whatever was available (NSSL-WRF, operational models, etc).

The complication for TX was an un-initialized short wave trough emanating from the subtropical jet across Mexico and moving northward. This feature was contributing to a north to south band of precipitation and eventually triggered a storm in central and eastern OK, well to the east of our domain. The NSSL WRF did not produce the short wave trough and thus evolved eastern TX much differently than what actually occurred despite having the subtropical jet in that area. So we were gutsy in picking this domain despite this short wave passing through our area. We were still thinking that the dryline could fire later on but once we completed our spatial confidence forecast (a bunch of 30 percents and one 10 percent) and our timing confidence (~+/- 90 minutes) it was apparent we were not very confident.

This was an acceptable challenge as we slowly began to assemble our spatial forecast, settling on a 3 hour period in which we restrict ourselves to worrying only about new, fresh convection by spatially identifying regions within our domain where convection is already present. This way we don’t have to worry about secondary convection directly related to pre-existing convection. We also decided that every forecaster would enter a spot on the map where they thought the first storms would develop (within 25 miles of their point). This makes the forecast fun and competitive and gets everyone thinking not just about a general forecast but about the scenario (or scenarios if there are multiple in the domain).
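
The "within 25 miles of their point" rule can be scored with a great-circle distance check. The following is only a sketch under assumed inputs: the function names are hypothetical, points are (latitude, longitude) pairs in degrees, and the haversine formula on a spherical Earth is one reasonable choice rather than the method used in the experiment.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, spherical approximation


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in statute miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def point_forecast_hit(forecast, first_storm, radius_miles=25.0):
    """True if the first observed storm lies within the radius of the forecast point."""
    return great_circle_miles(*forecast, *first_storm) <= radius_miles


# Hypothetical forecaster point and two hypothetical first-storm locations.
guess = (35.0, -100.0)
near_storm = (35.2, -100.0)   # 0.2 degrees of latitude away, roughly 14 miles
far_storm = (36.0, -100.0)    # 1 degree of latitude away, roughly 69 miles
hits = (point_forecast_hit(guess, near_storm), point_forecast_hit(guess, far_storm))
```

Ranking forecasters by this distance, rather than by a simple hit or miss, would be a natural extension for the competitive side of the exercise.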

The next stop on this day's adventure was MN/IA/Dakotas. This was challenging for multiple reasons:
1. the short wave trough moving north into OK/KS and its associated short wave ridge moving north-northeast,
2. the dryline and cold front to the west of MN/IA,
3. the cold upper low in the Dakotas moving east-northeast.

The focus was clear and the domain was to be RWF. This time we used a bigger domain in acknowledgement of the complex scenario that could unfold. You had the model initiating convection along the warm front, along the cold front in NE on a secondary moisture surge associated with the short wave trough, and a persistent signal of CI over Lake Superior (which we ignored).

We ended up drawing a rather large slight risk extending down into IA and NE from the main lobe in MN, with a moderate area extending from south central MN into northern IA. After viewing multiple new products, including simulated satellite imagery (water vapor and band differencing) from the NSSL WRF and the Nearcast moisture and equivalent potential temperature difference, it was decided that CI was probable, with everyone going above 50 percent confidence.

In Minnesota we did quite well, including showing a gap near Omaha where the moist surge was expected but did not materialize until after our 0-3 UTC time period. Once the moisture arrived … CI. In MN, CI began just prior to 23 UTC, encompassing some of our moderate risk even down into IA. These “storms” in IA were part of the CI episode and did produce lightning, but they would not be objectively classified as storms from a reflectivity and lifetime perspective.

The verification for Texas was quite bad. Convection formed to the east early, and to the west much later than anticipated associated with a southern moisture surge into NM from the upper level low migrating into the area nearly 11 hours after our forecast period start.

As it turns out, we awoke this morning to a moderate risk area in OK, but the NM convection was totally missed by the majority of model guidance! The dryline was in Texas still but now this convection was moving toward our CDS centerpoint and we hoped that the convection would move east. A review of the ensemble indicated some members had some weak signals of this convection, but it became obvious that it was not the same. We did key in on the fact that despite the missed convection in the TX panhandle the models were persistent in secondary initiation despite the now-developing convection in southern TX. We outlooked the area around western OK and parts of TX.

In the afternoon, we looked in more detail at the simulated satellite imagery, Nearcast, and the CIRA CI algorithm for an area in and around Indiana. This was by far the most complicated and intellectually stimulating area. We analyzed the ensemble control member for some new variables that we output near the boundary layer top (roughly 1.2 km AGL): WDT, the number of time steps in the last hour where w exceeded 0.25 m/s, and convergence. We could see some obvious boundaries as observed, with a unique perspective on warm sector open celled convection.
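
As described, WDT is a per-grid-point count over the last hour of model time steps. A minimal sketch of that count follows; the function names and the nested-list layout `w_history[t][j][i]` are assumptions for illustration, and the model's actual output code is not shown in this post.

```python
def count_updraft_steps(w_series, threshold=0.25):
    """Count the time steps in which vertical velocity w (m/s) exceeded the threshold."""
    return sum(1 for w in w_series if w > threshold)


def wdt_field(w_history, threshold=0.25):
    """Build a 2-D field of counts from w_history[t][j][i].

    w_history is assumed to hold one 2-D grid of w per model time step,
    covering the last hour at the chosen level (about 1.2 km AGL).
    """
    ny, nx = len(w_history[0]), len(w_history[0][0])
    return [[count_updraft_steps([grid[j][i] for grid in w_history], threshold)
             for i in range(nx)]
            for j in range(ny)]


# Three hypothetical time steps on a 1 x 2 grid.
history = [[[0.10, 0.30]],
           [[0.26, 0.30]],
           [[0.25, 0.00]]]
wdt = wdt_field(history)  # [[1, 2]]: 0.25 itself does not exceed the threshold
```

Persistent ascent along a boundary shows up as a line of high counts, which is why a field like this can highlight boundaries that a single snapshot of w would miss.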

In addition, we used the 3 hour probabilities of CI that have been developed specifically for CI, since these match our chosen 3 hour time periods. We have noticed significant areal coverage from the ensemble probabilities, which heavily weight the pre-existing convection CI points. Thus it has been difficult to assign the actual new CI probabilities, since we can't distinguish the probability fields if two close proximity CI events are in the area around where we wish to forecast. That being said, we have found them useful in these messy situations. We await a clean day to see how much of a difference that makes.

Day 1 in the 2011 HWT EFP

Posted on May 9, 2011 by James Correia Jr.

What a great start to the HWT. There were troubles, and troubleshooters. We had plenty of forecasters and plenty of forecast problems. All in all it was quite a challenge.

The convection initiation (CI) team had some great discussion on the CI definition including all the ways in which CI gets complicated. For example, visually we can identify individual clouds, or cloud areas on satellite. When using radar, we might select areas of high reflectivity that last for say 30 minutes. In the NWP models, we rely on quantitative values at a single grid point at two instances in time.

We also have the issue of whether CI is part of a larger episode (close in space and/or time to other storms) or developing as a direct result of previous convection (ahead of a squall line). In these related cases, visually identifying new storms might be easily accomplished, but in the model atmosphere (in a grid point centric algorithm) new CI points may be all over the place, say as gravity waves or outflow achieve just enough separation to be classified as new (thus CI) even though it might simply be redevelopment. From a probability standpoint, spatial probabilities of CI may thus be larger around existing convection. Does this enhanced probability, ahead of the line, signal actual new storm development?

Trying to establish an apples to apples comparison between model and human forecasts of such discrete events is a major challenge. We are testing 3 model definitions of CI to see their viability from the perspective of forecasters, and we will also evaluate object based approaches to CI.

Of course we cannot talk about where CI might be without talking about when! When will the first storm form? This gets back to your definition of CI. Should the storm produce lightning to be classified as a storm? How about reaching a threshold reflectivity? How about requiring that it last a certain amount of time? The standard definition of storms relies on mode (ordinary, multicell, supercell), each having a unique evolution with the placement of the updraft and precipitation fallout. But what about storm intensity (however you define it)?

I should also acknowledge that defining all of this can be quite subjective and is relevant to individual users of a CI forecast. So we are definition dependent, but most people know it when they see it. Let's consider two viewpoints: the severe storm forecaster and the aviation forecaster. The severe storm forecaster wants to know where and when a storm may form so they can decide the potential threat, thus leading to a product (a mesoscale discussion for specific hail, wind, or tornado threats), provided that storm or CI episode is long lived. The aviation forecaster might be concerned with the sudden appearance of cumulonimbus, which could pose an immediate threat to aircraft. But they are also concerned with the resulting coverage of new storms (diverting traffic, shutting down airports, planning new traffic routes or patterns) and the motion, expansion, and decay of existing storms.

And lastly it will be important for us to establish what skill the models and forecasters have with respect to CI. This is not a new area of study, but it is one where lots of complexity, vagaries of definitions, and also a lack of understanding contribute to making this one of the greatest forecast challenges.

As we refine what our forecast will consist of, we will report back on how our forecast product evolved. The more we forecast, the more we learn.

Comparing Experimental QPF Outlooks with Hi-res model/radar data

Posted on May 21, 2010 by NSSL Scientists & Collaborators.

As part of the 2010 HWT Spring Experiment (EFP), two new forecast components were added: Aviation and QPF. After analyzing various high-resolution (hi-res) models, outlooks were created for QPF expected to exceed 0.50 and 1 inch for two time periods (18-00 UTC and 00-06 UTC) for the Day 1 period. In the image below, you will see an example of the 6-hour total precipitation from one of these hi-res models (WRF-HRRR), overlaid with a “SLIGHT RISK” threat area for QPF exceeding 0.50 inch in the same 6-hour period. Each day during the Experiment, the morning forecast (threat area) was completed by 1530 UTC.

A screen capture of the latest composite reflectivity data is attached to show how the forecast is verifying to this point.

A “SLIGHT RISK” threat area is defined by 25 percent of the threat area being expected to reach or exceed a specific amount (e.g. 0.50 inch).
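
One way to check that definition against gridded precipitation is to compute the fraction of the threat area that reached the amount. The sketch below is an assumption-laden illustration, not the experiment's verification code: the function names are hypothetical, the grid is flattened to a list of equal-area cells, and `in_area` marks which cells fall inside the drawn threat area.

```python
def coverage_fraction(precip_in, in_area, threshold_in=0.50):
    """Fraction of threat-area cells whose precipitation (inches) met the threshold."""
    inside = [p for p, flag in zip(precip_in, in_area) if flag]
    if not inside:
        raise ValueError("threat area contains no grid cells")
    return sum(1 for p in inside if p >= threshold_in) / len(inside)


def slight_risk_verified(precip_in, in_area, threshold_in=0.50, required=0.25):
    """True if at least the required fraction of the threat area met the threshold."""
    return coverage_fraction(precip_in, in_area, threshold_in) >= required


# Five hypothetical grid cells; the last one lies outside the threat area.
precip = [0.60, 0.10, 0.50, 0.00, 2.00]
mask = [True, True, True, True, False]
fraction = coverage_fraction(precip, mask)     # 2 of 4 cells, so 0.5
verified = slight_risk_verified(precip, mask)  # True, since 0.5 >= 0.25
```

On a real latitude-longitude grid the cells are not equal in area, so each cell would need an area weight before taking the fraction.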

David Nadler
Warning Coordination Meteorologist
NWS Huntsville AL

Another review from the week of May 18-24

Posted on July 13, 2009 by NSSL Scientists & Collaborators.

First, as others have mentioned, special thanks to Steve and Jack for hosting the 2009 Spring Experiment. It’s of tremendous benefit to be able to focus on meteorology and consideration of many factors without the distractions of email/phones. Here is a reproduction of portions of a summary I prepared for my forecast staff:

There are considerable issues regarding mesoscale (and potentially stormscale) models being able to reproduce observed convective initiation, morphology, and evolution. Whether it be near-term assistance in high impact decision support services, or long term initiatives such as Warn on Forecast, the future success of improving services beyond detection systems (radar/satellite) depends on the ability of models to portray reality. Current challenges for models include: 1. ability to resolve features, even at high (1 km) resolution; 2. computational limitations; 3. initial conditions and data assimilation; 4. remaining poor understanding of physics and representation of physics in models; and 5. verification of non-linear, object-like features. Identifying where focus should be placed in improving the models is one main goal I found with the EFP. For example, given some set amount of computational resources, should one very high resolution model be run, or an ensemble of lower resolution models? What observations/assimilation/initial conditions seem most critical? Is a spatially correct forecast of a squall line, but with a timing error of 3 hours a “good” forecast? Is it a better forecast than a poor spatial representation of the squall line (e.g., a blob) with spot-on timing?

The main challenge we faced was literally a once-in-30-years event of poor convective conditions across the CONUS during the week – the first time since 1979 that SPC did not issue a watch for that corresponding week, the previous low being 10 watches for the same week in 1984! We did have pulse convection and a few isolated supercells in eastern Montana, eastern Wyoming, and western Nebraska, but it was a challenge for the models to even develop (or in some cases overdevelop) convection. After reflecting on our outlooks, verification, and evaluation/discussion of the models, some general conclusions can be drawn:

1. Mesoscale models are a long way from consistently depicting convective initiation, morphology, and evolution. Lou Wicker and Morris Weisman estimate it might be 10-15 years before models are at the level where confidence in their projections would be enough to warn for storms before they develop, presuming the inherent chaos of convection or lack of initial conditions/data assimilation even allows predictability. Progress will surely be made in computational power, better initial conditions (e.g., radar/satellite/land surface), and to some extent model physics, but there will remain a significant role for forecasters in the foreseeable future.

2. Defining forecast quality is extremely difficult when considering individual supercells, MCSs, and other convective features. What may be an excellent 12 hour outlook forecast of severe convection for a County Warning Area could be nearly useless for a hub airport TAF forecast. Timing and location are just as critical as meteorologically correct morphology of convection. Ensembles may be able to distinguish the most likely convective mode, but offer only modest assistance in timing and location.

3. The best verifying models through a 36-hour forecast seem to be those with 3-4 km grid spacing, “lukewarm” started with radar data physically balanced through the assimilation process. The initialization/assimilation scheme seems to have more of an influence than differences in the model physics. Ensemble methods (portraying “paintball splats” of a particular radar echo or other variable threshold), seem to offer some additional guidance beyond single deterministic runs, although it’s very hard to assess the viability/quality of the solutions envelope when making an outlook or forecast. The Method for Object-based Diagnostic Evaluation (MODE) is a developing program for meaningful forecast verification, based on the notion of objects (i.e., a supercell, an MCS) rather than a grid point based scheme.

Overall, the EFP experience was personally rewarding – an opportunity to get away from typical WFO operations and into applied research. The HWT facility and SPC/NSSL staff were fantastic and made for a high-level scientific, yet relaxed, environment. I strongly encourage anyone interested in severe convection from a forecast/outlook/model perspective to consider the EFP in future years.

-Jon Zeitler
NWS Austin/San Antonio, TX

A Few Thoughts Regarding the HWT Spring Experiment – Mike Fowle

Posted on June 11, 2009 by NSSL Scientists & Collaborators.

As others have mentioned, I want to thank the SPC and NSSL for coordinating the program once again this year. As a first year participant, I found the experience both challenging and rewarding. Although the overall magnitude/coverage of convective weather continued to be generally below normal – there were still plenty of forecast challenges to keep us busy throughout the week.

Now for some observations (mainly subjective) about a few of the issues we encountered:

1. Having completed a verification project on an early version of the MM5 (6 km horizontal grid spacing) back in the late 1990s, it was interesting to see the current evolution of mesoscale modeling. While there have been many changes/improvements (e.g. microphysics schemes, PBL schemes, radar assimilation, etc.) it was evident to me that many of the same problems we encountered (sensitivity to initial conditions, sensitivity to model physics, parameterization of features, upscale growth, etc.) still haunt this generation of mesoscale models.

2. With the increase in computer power, we are now able to run models with a horizontal grid resolution of 1 km over a large domain in an operational setting! However, examining the 1 km output did not seem to add much (if any) value over 4 km models – especially considering the extra computational expense.

3. All of the high resolution models still appear to struggle when the synoptic scale forcing is weak. In other words, modeling convective evolution dominated by cold pool propagation remains extremely challenging.

4. The output from the high resolution models remains strongly correlated to that of the parent model used to initialize. Furthermore, if you don’t have the synoptic scale conditions reasonably well forecast, you have little hope in modeling the mesoscale with any accuracy.

5. Not surprisingly, each model cycle tended to produce a wide variety of solutions (especially during weak forcing regimes) – with seemingly little continuity amongst individual deterministic members (even with the same ICs), or from run to run. Sensitivity to ICs and the lack of spatial and temporal observations on the mesoscale remain a daunting issue!

Even with some of these issues, on most days the high resolution models still provided valuable guidance to forecasters – most notably regarding storm initiation, storm mode, and overall storm coverage. Although the location/timing of features may not be exactly correct, seeing the overall “character of the convection” can still be of great utility to forecasters, especially considering that this information is not available in the current suite of operational models (i.e. NAM/GFS).

From a field office perspective – one of the big challenges I see in the future is how to best incorporate high resolution model guidance into the “forecast funnel.” Being that many forecasters already feel we are at (or even past!) the point of data overload, they need proof that these models can be of utility in the forecast process. Moreover, I believe that on an average day, most forecasters can/will devote at most 30-60 min to interrogate this guidance. Is this sufficient time? During the experiment we were devoting a few hours to evaluating the models – and I still felt we were only scratching the surface.

Next, what is the best method to view the data? A single deterministic run? Multiple deterministic runs? Probabilistic guidance from storm scale ensembles? Model post products (i.e. surrogate severe)? Some combination of the above? In addition, what fields give forecasters the most bang for the buck? Simulated reflectivity, divergence, winds, UH, updraft/downdraft strength? Obviously many of these questions have yet to be answered; however, what is clear to me is that significant training is going to be required regarding both what to view and how to view it.

In terms of verification, the object based methodology that DTC is developing is an interesting concept. Although still in its infancy, I like the idea and do see some definite utility. However, as we noted during the evaluation, it still appears as though this methodology may be best suited for a “case study” approach rather than an aggregate (i.e. seasonal) evaluation (at least at this point).

As echoed by others, it was a privilege to be a participant in this year’s program and I would jump at the opportunity to attend in future years. In my humble opinion, I think the mesoscale models have proved long ago that they do have utility in the forecast process – if used in the proper context. There are obvious challenges to embark upon in the years to come, and I look forward to seeing the continued evolution of techniques/technology in future years.

11-15 May 2009 Spring Experiment

Posted on June 8, 2009 by NSSL Scientists & Collaborators.

I would like to thank SPC, NSSL, and others for the invitation to participate in the 2009 HWT. As most of you are aware (either through discussions with me or your experiences sitting at an airport for hours waiting for a delayed flight) the socioeconomic impacts of convective weather on the aviation industry are substantial. Attending the HWT is a way to grasp where the edge of the science is and establish a reality check for myself and to share with others as we work towards NextGen. It is an intriguing experience to learn from both operational forecaster feedback and our own forecasts we developed at HWT that the solution to increased accuracy does not correlate to higher model resolution as some might believe. Having an HWT week with an aviation focus is a good way for this research to get some exposure in a capacity that it may not have been designed for but illustrates potential utility and benefit.

I have pointed out things like Simulated Reflectivity from the HWT last year that have gained attention based on their potential utility in using the data at a high level to get an understanding of what the National Airspace System scenario might look like for the day. Although it may never verify due to its deterministic nature and is somewhat noisy, it is a good visual aid for an Air Traffic Flow Manager (non-meteorologist) to get a quick frame of reference on potential systemic impacts.

Other forecasts like the probability of >=40 dBZ intensity within 25 miles of a point and the 18 member ensemble for 40 dBZ intensity could have enormous value for the aviation industry, as 40 dBZ is also the level of intensity that aircraft no longer penetrate, thus causing deviations and ultimately delays. Research on convective mode was another area I found myself intrigued with, as convective mode from an aviation perspective provides a frame of reference to determine the permeability of the convective constraint. Discrete cells and linear convection can be equally disruptive but would be managed very differently if forecast with a high degree of skill, so modal info for aviation is as important as location and timing.
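
A "probability of >=40 dBZ within 25 miles of a point" can be read as the fraction of ensemble members with a qualifying echo inside that radius. The sketch below illustrates that reading only; the function names, the uniform grid spacing `dx_miles`, and the equal weighting of members are assumptions, and the actual HWT product may have been computed differently.

```python
import math


def member_hit(dbz_grid, j0, i0, dx_miles, radius_miles=25.0, threshold_dbz=40.0):
    """True if any grid point within the radius of (j0, i0) meets the dBZ threshold."""
    reach = int(radius_miles // dx_miles)  # how many grid points the radius spans
    ny, nx = len(dbz_grid), len(dbz_grid[0])
    for j in range(max(0, j0 - reach), min(ny, j0 + reach + 1)):
        for i in range(max(0, i0 - reach), min(nx, i0 + reach + 1)):
            close_enough = math.hypot(j - j0, i - i0) * dx_miles <= radius_miles
            if close_enough and dbz_grid[j][i] >= threshold_dbz:
                return True
    return False


def neighborhood_probability(members, j0, i0, dx_miles,
                             radius_miles=25.0, threshold_dbz=40.0):
    """Fraction of ensemble members with a qualifying echo near the point."""
    hits = sum(member_hit(grid, j0, i0, dx_miles, radius_miles, threshold_dbz)
               for grid in members)
    return hits / len(members)


# Three hypothetical members on a 1 x 5 grid with 10 mile spacing.
members = [
    [[0, 0, 45, 0, 0]],   # 45 dBZ echo 20 miles from the point: counts
    [[0, 0, 0, 45, 0]],   # 45 dBZ echo 30 miles from the point: too far
    [[10, 15, 20, 5, 0]], # nothing reaches 40 dBZ
]
prob = neighborhood_probability(members, j0=0, i0=0, dx_miles=10.0)  # 1 of 3 members
```

With an 18 member ensemble the same function would return probabilities in steps of 1/18.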

Thanks for a great week!
John

Reid Hawkins’ view of the June 1-5 HWT Spring Experiment

Posted on June 7, 2009 by NSSL Scientists & Collaborators.

As I sit here at the Will Rogers Airport in Oklahoma City waiting on a plane that is 2 hours late, I wanted to reflect on my experiences with the 2009 HWT Spring Weather Experiment. This reflection will be more in the style of a stream of consciousness, so I hope someone out there can follow it.

First, a well deserved round of applause for Steve W, Jack, and Matt on steering us through the plethora of numerical models and the objective verification techniques of DTC. Our week started off with a rather well behaved and straightforward event over the northern Mississippi Valley into the northern plains. The second day was a highly frustrating forecast over Oklahoma, Kansas, and northern Texas, where overnight convection and gravity waves, along with the lack of convection over Oklahoma, played havoc with the forecast. The third day was even more frustrating, with a more weakly forced case south of an east-west stationary front from northern Virginia back to Kentucky. The final case was a high plains case from Wyoming and Nebraska southward to the Texas Panhandle.

For our week of evaluating the models, my first impression was the number of models that provided a whole host of solutions. Through experience from the staff, they steered us to look at the reflectivity fields, outflows and updrafts instead of digging ourselves into a myriad of model fields that no one could have possibly looked at in the short time we had to prepare a forecast outlook. After shifting my paradigm to this style of forecasting which was somewhat uncomfortable, it was a comfortable feeling when we saw similar results from the models. This was not a common event as most of the cases were marginal or weakly forced.

One concern I have is that I did not see a huge bang for the money in the 1 km CAPS model runs vs. the 4 km CAPS models. There was a huge discussion about the assimilation of the data into the models, and my thoughts are that until we sample the atmosphere with higher resolution, more frequently, and more accurately, I do not see where the higher resolution models will provide better results for forecast operations. This is just an opinion and I hope the modelers can prove me wrong.

Another concern I have is the way the data is displayed to the forecaster. With the wealth of data that is available and our current display techniques, I am afraid this has forced or will force many forecasters to find a comfort level of what data types to use. This means there may be valuable data sets to view, but due to comfort level and time constraints these data sets may never be used.

Overall, the week was extremely enlightening in seeing the techniques that are being developed to help the forecaster. In time, as the development envelope is pushed, I expect to see great information delivered to the operational desk. I am somewhat disheartened but not surprised by the lack of help we saw in weakly forced environments.