w2clustertable: a way to track storm properties over time

w2segmotionll provides a way to identify storm cells and track their properties over time, so that storm attributes can be data mined.  But w2segmotionll is not the best place for that functionality because:

  1. w2segmotion is also used to estimate motion from clusters, so as to create image nowcasts (using w2advector).  Adding storm attribute generation makes w2segmotion a performance hog in real-time applications.
  2. Storm tracking should not be limited to w2segmotion’s way of identifying clusters. Any object-identification method (not just the enhanced watershed approach we use) should be supported.

Following the WDSS-II philosophy, therefore, the cluster-table functionality of w2segmotion has been split off into a new program called w2clustertable.  Instead of providing the -X option to w2segmotion, pass the XML file into the -X option of w2clustertable.

w2clustertable also takes, as input, a label grid (the “KMeans” output of w2segmotion) and motion estimates (the Motion_East and Motion_South outputs of w2segmotion).  But you could also use w2imgmotion to obtain cross-correlation-based motion estimates. And you can use any scheme to create a labeled set of storms (1,2,3…) to get their properties. It is not necessary for storm #4 in one frame to be storm #4 in the next frame (i.e., w2clustertable will associate storms across frames using centroid matching, overlap, or one of several other built-in association methods).
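For example, an invocation might look like the following (a sketch, not exact usage: -X is described above, and -i/-o follow the usual WDSS-II input/output convention, but check w2clustertable’s usage output for the full set of flags in your build):

w2clustertable -i /path/to/code_index.xml -X /path/to/clusterTable.xml -o /path/to/output --verbose

Here, the input index is assumed to contain the label grid and the motion estimates described above.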

If you use w2clustertable, please continue to cite our 2009 J. Tech paper:  V. Lakshmanan and T. Smith, “Data mining storm attributes from spatial grids,” J. Atmos. Oceanic Technol., vol. 26, no. 11, pp. 2353-2365, 2009.

 

With this change, here are some related programs in the storm identification and tracking realm:

  1. w2segmotionll uses K-Means clustering and Enhanced Watershed to identify storm cells at multiple scales.
    (see: V. Lakshmanan, K. Hondl, and R. Rabin, “An efficient, general-purpose technique for identifying storm cells in geospatial images,” J. Atmos. Oceanic Technol., vol. 26, no. 3, pp. 523-37, 2009)
  2. w2segmotionll uses these storms to compute motion estimates  (see: V. Lakshmanan, R. Rabin, and V. DeBrunner, “Multiscale storm identification and forecast,” J. Atm. Res., vol. 67, pp. 367-380, July 2003.)
  3. w2segmotionll and w2clustertable can both be used to track storm cells over time (see: V. Lakshmanan and T. Smith, “An objective method of evaluating and devising storm tracking algorithms,” Wea. Forecasting, vol. 25, no. 2, pp. 721-729, 2010.), but w2clustertable is now preferred.
  4. w2segmotionll and w2clustertable can both be used to compute storm attributes, but w2clustertable is now preferred. (see: V. Lakshmanan and T. Smith, “Data mining storm attributes from spatial grids,” J. Atmos. Oceanic Technol., vol. 26, no. 11, pp. 2353-2365, 2009)
  5. w2segmotionll and w2advectorll can both be used to create nowcasts of other fields from motion estimates obtained by tracking storms on one field but w2advectorll is the preferred way to do that. (see: V. Lakshmanan, R. Rabin, and V. DeBrunner, “Multiscale storm identification and forecast,” J. Atm. Res., vol. 67, pp. 367-380, July 2003.)
  6. w2flatten will take the multiscale cluster table output by either w2segmotionll or w2clustertable and “flatten” it into a single table, to make multi-scale data mining possible (forthcoming paper by Humphrey, Lakshmanan, Smith, Smith and Thompson).

 

Tags: None

Merge radars into 3D Mosaic

Written by Bryan Burlingame [Part 3 of How to Create Reflectivity -10C]

Creating Cache for the Desired 3D Latitude and Longitude Grid

Create the cache for the 3D latitude and longitude grid to be utilized for all the radars using createCache. This creates a store of data that the w2merger process (to be run later) can access. The information in the cache relates to the size of the domain you wish to use, as well as which radars will be on that grid. The default directory for this cache is your $HOME directory, and it is titled .w2mergercache.

        1. The process of creating the cache MUST BE DONE FOR ALL RADARS. Here is an example of the command:

createCache -o $HOME -i KXXX -T /path/to/terrain/KXXX.nc -t "50 -111 20" -b "27 -93 1" -s "0.05 0.05 1" --verbose

    1. o = creates the cache called ".w2mergercache" in the $HOME directory. w2merger will look in the $HOME directory for the cache, so it is best to leave it at $HOME (the default is $HOME)

    2. i = the radar the cache is being created for (e.g., KXXX). MUST BE DONE FOR ALL RADARS

    3. T = the path to the terrain files, so that any terrain obstructions for a radar will be known. You can obtain these data, if you do not already have them, from ftp://ftp.nssl.noaa.gov/users/lakshman/conus_radar_blockage.tgz.

    4. t = Top (NW corner) of the new lat/lon grid being created.

    5. -t "50 -111 20" corresponds to -t "(north-Lat) (west-Lon) (number of vertical levels)"

    6. b = Bottom (SE corner) of the new lat/lon grid being created.

    7. -b "27 -93 1" corresponds to -b "(south-Lat) (east-Lon) (lowest vertical level)"

    8. s = the spacing between grid points

      1. -s "0.05 0.05 1" corresponds to -s "(Lat spacing) (Lon spacing) (vertical level spacing (km))"

  2. This MUST BE DONE FOR ALL RADARS; new radars will be added into the first cache created. (See the loop sketch after this list.)

    1. *The -t, -b, and -s values used must be the same as those passed to w2merger below
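Since the cache must be created separately for every radar, it is convenient to loop over them. A minimal sketch, assuming terrain files named after each radar (replace the radar list and paths with your own):

#!/bin/sh
# Hypothetical radar list; use the radars in your domain.
for RADAR in KXXX KYYY KZZZ; do
  createCache -o $HOME -i $RADAR -T /path/to/terrain/$RADAR.nc -t "50 -111 20" -b "27 -93 1" -s "0.05 0.05 1" --verbose
done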

Merging Radar Data to Common Lat/Lon 3D Grid and Interpolating Reflectivity to the -10˚C Isotherm in RAP/RUC Data

The next 3 commands need to be run simultaneously, started in the order specified. This will merge the radar data and will also interpolate it to the RAP data. (The shell script I created ran w2simulator in the background, slept for 2 seconds, then started w2merger and w2segmotionll; that worked.) After you begin the simulator, it will prompt you to begin other algorithms; at this time, begin w2merger and w2segmotionll. A sketch of such a driver script appears at the end of this section.

  1. w2simulator – This is the command that synchronizes the process. It will read in all the various radar directories (AND THE RAP code_index.xml mentioned above) and output the data into index_N.fam directories, which are used as the input for w2merger.

    1. N counts the inputs read into the simulator; one index_N.fam directory is created per input.

w2simulator -i "/path/to/radar1/code_index.xml /path/to/radar2/code_index.xml /path/to/more/radars/code_index.xml /path/to/rap/code_index.xml" -o /path/to/simulation -b 20130515-000258 -e 20130515-010700 --verbose

    1. i = the paths to the radar xml files; should also include the path to the RAP .xml file created using nse

    2. o = the output path where the index_N.fam files are put.

      1. This is the input path for w2merger

    3. b = beginning time of the simulator

    4. e = end time of the simulator

      1. Both -e and -b must be of the format YYYYMMDD-HHMMSS

  2. w2merger – This is the command that pulls in the radar and RAP data from the various sources and merges them onto the grid that was created using createCache. Note that the input to w2merger is the output from w2simulator.

w2merger -i "$COMMON/simulation/index_0.fam $COMMON/simulation/index_1.fam $COMMON/simulation/index_2.fam" -o $COMMON/merged -I ReflectivityQC:00.50 -M $COMMON/segmotion -e 60 -C 5 -t "50 -111 20" -b "27 -93 1" -s "0.05 0.05 1" -a Isotherms --verbose

  1. i = the paths to the radar/RAP files created by running the w2simulator command. This should be the same as the output path of w2simulator, linking to the index_N.fam directories there.

  2. o = the output path of the now-merged files (should be used as the input to w2segmotionll)

  3. I = (capital I) is the variable you wish to merge (e.g., ReflectivityQC:00.50 is the quality-controlled 0.5-degree reflectivity)

  4. M = tells the merger to refer to and include the output from w2segmotionll. w2merger and w2segmotionll work together in a feedback process.

  5. e 60 = make an output once every 60 seconds. Omitting the -e option will cause a rapidly updating grid that your I/O hardware may not be able to handle properly.

  6. C 5 = changes the way data values are combined if they overlap in both space and time. 5 (TimeAndDistanceWeighted) means that the overlapping data are weighted with respect to both distance and time, so that later data are weighted higher. This is the "common" selection.

  7. t -b -s = SAME AS createCache. For the meaning of these arguments, refer back to the createCache flags above. The values should be the same as those of the grid created in that step.

  8. a = to specify the algorithms you wish to run on the 3D grid.

  3. w2segmotionll – This will get motion estimates from the data and feed them into w2merger. Optional, but recommended in the user guide for multiple-radar cases. It corrects for time differences between radar elevation scans (from the same or different radars) by advecting older data before blending. The output directory should be specified in w2merger by passing the -M flag.

w2segmotionll -i $COMMON/merged/code_index.fam -o $COMMON/segmotion -T MergedReflectivityQC -O 5 --verbose

  1. i = the input path for the code_index.fam directory, which is created by the w2merger command (-o in w2merger). Although it is a directory, do not include a trailing / when entering the path in the w2segmotionll command.

  2. o = the output path of the motion estimates; once again, the same as the -M flag passed to w2merger

  3. T = this flag tells the w2segmotionll command what variable to take the motion estimates from.

  4. O = changes the time interval over which motion is computed. If O = 30, then frames at least 30 minutes apart are used in the motion estimation.
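Putting the three commands together, a minimal driver script of the kind described above might look like this (a sketch; $COMMON, the input paths, and the times are placeholders to adjust for your case):

#!/bin/sh
COMMON=/path/to/work   # hypothetical working directory
# start the simulator in the background and give it a moment to come up
w2simulator -i "/path/to/radar1/code_index.xml /path/to/radar2/code_index.xml /path/to/rap/code_index.xml" -o $COMMON/simulation -b 20130515-000258 -e 20130515-010700 --verbose &
sleep 2
# then start the merger (also in the background) and segmotion
w2merger -i "$COMMON/simulation/index_0.fam $COMMON/simulation/index_1.fam $COMMON/simulation/index_2.fam" -o $COMMON/merged -I ReflectivityQC:00.50 -M $COMMON/segmotion -e 60 -C 5 -t "50 -111 20" -b "27 -93 1" -s "0.05 0.05 1" -a Isotherms --verbose &
w2segmotionll -i $COMMON/merged/code_index.fam -o $COMMON/segmotion -T MergedReflectivityQC -O 5 --verbose
wait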

Tags: None

Creating QCed Radar Data

Written by Bryan Burlingame [Part 2 of How to Create Reflectivity -10C]

NEXRAD NCDC Data:

Download individual NEXRAD radar data from:

http://www.ncdc.noaa.gov/nexradinv/map.jsp

Or in bulk from:

http://has.ncdc.noaa.gov/pls/plhas/HAS.FileAppRouter?datasetname=6500&subqueryby=STATION&applname=&outdest=FILE

Once downloaded, untar and unzip any files if necessary. Then, as mentioned above, organize the files so that only the files you actually need are in a directory.

 

Converting NEXRAD Level II Data to WDSS-II NetCDF

The steps below need to be done individually for all the radars that you plan to merge (i.e., If you have data from 3 radars, you must run ldm2netcdf (below) 3 times, once for each radar.)

I recommend reading the “quick usage” guides available via the WDSS-II website, or typing just the command into the terminal to see the other flags that can be passed. Only the flags we used are explained below.

        1. The command needed to convert the NEXRAD data to WDSS-II NetCDF is ldm2netcdf. An example of the command issued is below.

ldm2netcdf -i /path/to/NEXRAD/data -a -1 -p KXXX -s KXXX -o /path/to/radar/NetCDF/ --verbose

          1. i = input radar directory (where the radar data were downloaded to)
          2. a = reads old LDM files in the input directory
          3. -1 = (one) not real time
          4. p = the filename pattern to match, i.e., the radar site identifier (e.g. KXXX)
          5. s = the radar site (e.g. KXXX)
          6. o = the output directory (where you want your now NetCDF files to go, if directory does not exist, one will be created for you).
        2. Use replaceIndex to create the needed configuration file code_index.xml.

replaceIndex -i /path/to/code_index.fam -o /path/to/code_index.xml

          1. i = path should be the same as the output directory from ldm2netcdf, linking to the code_index.fam/ directory generated by that command
          2. o = should be the same path as the input here, except the output is code_index.xml

De-Aliasing NEXRAD Level II Velocity Data

In this step, we will de-alias the radar velocity data.

        1. The command used in this step is “dealias2d.” Here is an example of the command issued.

dealias2d -i /path/to/code_index.xml -o /path/to/radar/files -R KXXX --verbose

  1. i = path should be the same as the output from replaceIndex run in the previous step, linking to the code_index.xml file, which is the configuration file for all the radar data
  2. o = should be the same output path as used above in ldm2netcdf.
  3. R = should be the same as the flags passed for -p and -s in ldm2netcdf (e.g. KXXX)
        2. Run replaceIndex on the code_index.fam directory in your NetCDF output directory. The command issued should be identical to the command issued after running ldm2netcdf. This will update the configuration file so it contains the newly de-aliased velocity data.

Quality Controlling NEXRAD Level II Reflectivity Data

This step will run Quality Control (QC) tests on the radar data to eliminate noise and clutter. There are two different commands that can be used; which one depends on whether or not your radar data contain dual-pol information.

  1. QC the radar data using w2qcnn if your radar data do not contain dual-polarization information, or w2qcnndp if they do.
  2. For w2qcnn, here is an example command:

w2qcnn -u -i xml:/path/to/code_index.xml -V Velocity -E /path/to/terrain/KXXX.nc -R 0.25x0.5x460 -o /path/to/radar/NetCDF --verbose

  3. For w2qcnndp, here is an example command:

w2qcnndp -i xml:/path/to/code_index.xml -V Velocity -E /path/to/terrain/KXXX.nc -R 0.25x0.5x460 -s KRRR -o /path/to/radar/files --verbose

  1. u = creates virtual volumes (necessary for w2qcnn; do not use for w2qcnndp)
  2. i = path should be the same as the output from replaceIndex run in the previous step, linking to the code_index.xml file, which is the configuration file for all the radar data
  3. V = Velocity - tells the program to use dealiased (rather than aliased) velocity in quality control
  4. E = (path) - tells the program where terrain data for each radar may be found for use in quality control
  5. R = 0.25x0.5x460 - tells the program that super-resolution Level II radar data are being quality controlled; if super-resolution Level II radar data are not being utilized, then set this to 1x1x460
  6. s = KRRR - provides the radar site identifier to the program
  7. o = path to the output file directory; should be the same output path as used above in ldm2netcdf and dealias2d.
  4. Run replaceIndex on the code_index.fam directory in your NetCDF output directory. The command issued should be identical to the command issued after running ldm2netcdf and dealias2d. This will update the configuration file so it contains the newly QC’d data.
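For reference, here is the whole single-radar sequence as a script (a sketch assuming non-dual-pol, super-resolution data; for dual-pol, swap in w2qcnndp and drop -u):

#!/bin/sh
RADAR=KXXX    # hypothetical radar identifier
RAW=/path/to/NEXRAD/data
OUT=/path/to/radar/NetCDF
TERRAIN=/path/to/terrain/$RADAR.nc

# convert Level-II data to WDSS-II NetCDF
ldm2netcdf -i $RAW -a -1 -p $RADAR -s $RADAR -o $OUT --verbose
replaceIndex -i $OUT/code_index.fam -o $OUT/code_index.xml

# de-alias the velocity data
dealias2d -i $OUT/code_index.xml -o $OUT -R $RADAR --verbose
replaceIndex -i $OUT/code_index.fam -o $OUT/code_index.xml

# quality-control the reflectivity data
w2qcnn -u -i xml:$OUT/code_index.xml -V Velocity -E $TERRAIN -R 0.25x0.5x460 -o $OUT --verbose
replaceIndex -i $OUT/code_index.fam -o $OUT/code_index.xml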

 

Tags: None

Getting RUC/RAP data into WDSS-II

Written by Bryan Burlingame [Part 1 of How to Create Reflectivity -10C]

To start, the first steps require requesting and downloading large datasets. This process can be time consuming, both in waiting for the NCDC order to come through and in downloading the data.

RAP/RUC Data:

Download RAP/RUC from:

http://nomads.ncdc.noaa.gov/data.php?name=access#hires_weather_datasets

After downloading the needed RAP data, organize the data into directories so that only the times you want processed are in a directory. The WDSS-II programs read in whole directories’ worth of data at a time. So, to minimize the processing time and the size of your data, keep only the data needed for the time of interest. Utilities such as wgrib or wgrib2, available from NCEP, are particularly useful for this task.

Converting RAP/RUC Data to WDSS-II NetCDF

There are two different processes for grib1 data and for grib2 data. Typically, RAP data will be in grib2 format and RUC data will be in grib1. In order for WDSS-II to read in RAP/RUC data, it needs to be in the WDSS-II NetCDF format. Below are the steps needed to convert both GRIB 1 & 2 to the required format.

GRIB 1

  1. The command used to convert GRIB 1 to WDSS-II NetCDF is gribToNetcdf. Here is an example of the command:

gribToNetcdf -i /path/to/grb1 -o /creates/new/directory/NetCDF -k -a -1 -p 2013 -t "50 -111" -b "27 -88" -s "0.03 0.03" --verbose

    1. i = input directory of the GRIB 1 data.
    2. o = where you want the generated NetCDF files to go (creates the directory if it does not exist).
    3. k = tells command NOT to delete the original GRIB 1 files after processing.
    4. a = if on, will read the existing files in the directory on startup.
    5. -1 = if on, will execute just once (not real-time).
    6. p = pattern recognition (if all files start with rap_130… you would use -p rap_130)
    7. t = the northwest corner of the domain you wish to use (Should be same as below with w2merger).
    8. b = the southeast corner of the domain you wish to use (Should be same as below with w2merger).
    9. s = the horizontal spacing of the domain
  2. Run “replaceIndex” to make a .xml file from the .fam directory. The code_index.fam directory will be inside the output directory specified to gribToNetcdf above.

replaceIndex -i /path/to/code_index.fam -o /path/to/code_index.xml

  1. i = input directory (path to code_index.fam/) of the now-WDSS-II NetCDF files (same as the output directory used above in gribToNetcdf).
  2. o = same directory as above; however, the output is code_index.xml rather than .fam
  3. Run “nse” to create various products that RAP/RUC does not have, but which WDSS-II needs to interpolate to the various isotherms in the RAP/RUC data.

nse -i xml:/path/to/code_index.xml -o NSE/ -3 --verbose

GRIB 2

  1. There are two options for converting GRIB 2 to WDSS-II NetCDF. Option 1 (the recommended option) is to continue on to step 2 below. The other option is to convert the GRIB 2 data to GRIB 1 using the command: cnvgrib -g21 <inputfile> <outputfile>
    1. cnvgrib readme: http://www.nco.ncep.noaa.gov/pmb/docs/grib2/download/cnvgrib.readme
    2. After converting to GRIB 1, return to the previous section for converting GRIB 1 to WDSS-II NetCDF.
  2. For GRIB 2, use the WDSS-II Java package to convert GRIB 2 to WDSS-II NetCDF. In steps 1 and 2 below, make sure the proper paths are declared.
    1. export JAVA_HOME="/usr/bin"
    2. export WDSSII_INSTALL_DIR="/path/to/wdssii/wdssiijava"
  3. Next, copy the file wdssiijava/example/griddataingest.xml into a directory of your choice and rename it rap.xml or ruc.xml (any name ending in .xml works). Then edit the newly copied .xml file. At the top of the file, change the inputDir to where your data are and the outputDir to where you want the output to go. Also change the filenamePatterns to match some part of your input files (if all files start with rap_130… you would use rap_130 as your filenamePattern). You will also want to uncomment the listVariables entry and set it to true.
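The relevant entries in the copied file might look like the following (a sketch only: the entry names come from the description above, but the exact element syntax may differ between wdssiijava versions, so treat the copied example file as the authoritative template):

<inputDir>/path/to/rap/grib2</inputDir>
<outputDir>/path/to/rap/netcdf</outputDir>
<filenamePatterns>rap_130</filenamePatterns>
<listVariables>true</listVariables>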

 

  4. Now run the following command:

w2java.sh org.wdssii.ncingest.GridDatasetIngest /path/to/copied/griddataingest.xml

  5. This will create your configuration file, varList.xml. This file contains all of the meteorological variables in your RAP/RUC data. By editing this file, you can change which variables are processed into the new NetCDF files.
  6. Next, edit the griddataingest.xml file that you copied and renamed so as to set listVariables to false.
  7. Now run the following command again (yes, the same as above):

w2java.sh org.wdssii.ncingest.GridDatasetIngest /path/to/copied/griddataingest.xml

  8. Run “replaceIndex” to make a .xml file from the .fam directory. The code_index.fam will be in the outputDir specified in the griddataingest.xml file you copied over and edited.

replaceIndex -i /path/to/code_index.fam -o /path/to/code_index.xml

    1. i = input directory (path to code_index.fam/) of the now-WDSS-II NetCDF files (same directory as the outputDir mentioned above).
    2. o = same directory as above; however, the output is code_index.xml rather than .fam
  9. Run “nse” to create various products that RAP/RUC does not have, but which WDSS-II needs to interpolate to the various isotherms in the RAP/RUC data.

nse -i /path/to/code_index.xml -o /path/to/NSE/ -3 --verbose

 

Tags: None

Interpolating Reflectivity to -10C Level in RAP/RUC Data

Written by Bryan Burlingame

WDSS-II Guide to Interpolating Reflectivity to -10˚C Level in RAP/RUC Data

The process of obtaining data from multiple radars, quality controlling the data, merging the various radars onto a common three-dimensional latitude and longitude grid, and then interpolating that radar reflectivity to the -10˚C level found in RAP/RUC data is explained here. This process was conducted, and this document created, by Bryan Burlingame (burling6@uwm.edu) and Dr. Clark Evans at the University of Wisconsin-Milwaukee for the purpose of identifying convective initiation (CI) in the hours immediately following Mesoscale Predictability Experiment (MPEX) research flights.

 

Lak’s note:

The entire document is being posted in three pieces for easier searchability.  The steps to create the Reflectivity-10C product are:

  1. Get RUC/RAP data into WDSS-II
  2. Create QCed radar data from Level-II
  3. Mosaic radar data taking into account temperature level

Thanks to Bryan for sending me these instructions.

 

Tags: None

Creating Rotation Tracks using WDSS-II

How do you go about creating rotation tracks starting from Level-II radar data from NCDC?

[Image: KBMX 120-minute RotationTrack product, 2011-04-28 03:01 UTC]

The entire process is described in M. Miller, V. Lakshmanan, and T. Smith, “An automated method for depicting mesocyclone paths and intensities,” Wea. Forecasting, vol. 28, pp. 570-585, 2013

If you use Rotation Tracks in your research, please cite the above paper and also cite the papers for each of the following steps.

  1. Untar the Level-II data and place it somewhere. Let’s call this directory RAWDIR
  2. Get terrain netcdf data for your radar. You can get terrain files for US radars from ftp://ftp.nssl.noaa.gov/users/lakshman/conus_radar_blockage.tgz.   Untar this, and let’s call this directory TERRAIN
  3. Decide where you want your output products to go. Let’s call this DATADIR.
  4. Define a variable RADAR to hold your radar identifier (e.g. KBMX)
  5. Run ldm2netcdf to convert the Level-II data into NetCDF
  6. QC the radar reflectivity data.  Note that I am assuming that you don’t have dualpol (if you do have dualpol, you should use w2qcnndp) and that you do have super-resolution (if you have 1km resolution, change -R accordingly)
  7. Dealias the velocity data
  8. Compute Azimuthal Shear
  9. Run w2merger to put the data on a LatLonGrid
  10. Run w2accumulator with QC to create the rotation tracks

Here’s a script that will carry out the entire process. Edit as needed.

#!/bin/sh

RAWDIR=`pwd`/raw
RADAR=KBMX

# Overall reference about the entire process
# M. Miller, V. Lakshmanan, and T. Smith, ``An automated method for depicting mesocyclone paths and intensities,'' Wea. Forecasting, vol. 28, pp. 570-585, 2013. 

TERRAIN=~/WDSS2/gtopo30/radars/$RADAR.nc
DATADIR=`pwd`/$RADAR

# (5) convert Level-II to netcdf
# V. Lakshmanan, T. Smith, G. J. Stumpf, and K. Hondl, ``The warning decision support system - integrated information,'' Wea. Forecasting, vol. 22, no. 3, pp. 596-612, 2007.
ldm2netcdf -i $RAWDIR -o $DATADIR -s $RADAR -p $RADAR -a -1 --verbose
replaceIndex -i $DATADIR/code_index.fam -o $DATADIR/code_index.xml

# (6) note: if you have dualpol data, use w2qcnndp instead of w2qcnn. The rest of the command-line is the same
# V. Lakshmanan, A. Fritz, T. Smith, K. Hondl, and G. J. Stumpf, ``An automated technique to quality control radar reflectivity data,'' J. Applied Meteorology, vol. 46, pp. 288-305, Mar 2007
# V. Lakshmanan, C. Karstens, J. Krause, and L. Tang, ``Quality control of weather radar data using polarimetric variables,'' J. Atm. Ocea. Tech., vol. 0, p. 0, 2013. 
w2qcnn -i $DATADIR/code_index.xml -o $DATADIR -R 0.25x0.5x460 -s $RADAR -E $TERRAIN -u --verbose
replaceIndex -i $DATADIR/code_index.fam -o $DATADIR/code_index.xml

# (7) note: if you have sounding information, provide it. the results will be better
# Jing and Wiener 1993
dealias2d -i $DATADIR/code_index.xml -o $DATADIR --verbose
replaceIndex -i $DATADIR/code_index.fam -o $DATADIR/code_index.xml

# (8) run LLSD
# Smith and Elmore 2004 
w2circ -i $DATADIR/code_index.xml -o $DATADIR -a -w -z ReflectivityQC -Z 20 -D -t -c -L "0:2:1.0:7.5:AGL  3:6:0:90:AGL" -V "0.5 250 920" -G $RADAR -g $TERRAIN --verbose
replaceIndex -i $DATADIR/code_index.fam -o $DATADIR/code_index.xml

# (9) run w2merger to put the data on a cartesian grid
# V. Lakshmanan, T. Smith, K. Hondl, G. J. Stumpf, and A. Witt, ``A real-time, three dimensional, rapidly updating, heterogeneous radar merger technique for reflectivity, velocity and derived products,'' Wea. Forecasting, vol. 21, no. 5, pp. 802-823, 2006. 
# V. Lakshmanan and T. W. Humphrey, ``A MapReduce technique to mosaic continental-scale weather radar data in real-time,'' IEEE J. of Select Topics in Appl. Earth Obs. and Remote Sensing, vol. 0, no. 0, 2013.
TOP=`grep -A 2 $RADAR ~/WDSS2/src/w2/w2config/misc/radarinfo.xml | head -2 | tail -1 | sed 's/[=\"]/ /g' | awk '{print $3+4,$5-4}'`
BOT=`grep -A 2 $RADAR ~/WDSS2/src/w2/w2config/misc/radarinfo.xml | head -2 | tail -1 | sed 's/[=\"]/ /g' | awk '{print $3-4,$5+4}'`
echo "$TOP to $BOT"
w2merger -i $DATADIR/code_index.xml -o $DATADIR -I AzShear_0-2kmAGL -p 0.001 -e 60 -C 1 -R 230 -t "$TOP 1" -b "$BOT 0" -s "0.005 0.005 1" --verbose
replaceIndex -i $DATADIR/code_index.fam -o $DATADIR/code_index.xml

# (10) run w2accumulator with QC
# V. Lakshmanan, M. Miller, and T. Smith, ``Quality control of accumulated fields by applying spatial and temporal constraints,'' J. Atmos. Ocean. Tech., vol. 30, pp. 745-757, 2013. 
w2accumulator -i $DATADIR/code_index.xml -o $DATADIR -R -s -t "60 120 360" -C 1 -O RotationTrack -t 120 -Q blob:0.002:0.005:25:azshear,mht:1:2:1800:5:1 -g MergedAzShear_0-2kmAGL --verbose
replaceIndex -i $DATADIR/code_index.fam -o $DATADIR/code_index.xml
Tags: None

w2polygon2csv: A way to extract statistics

w2polygon2csv is a very useful WDSS-II tool for researchers.  This post shows an example of how to use it.

The question I want to answer: “What are the polarimetric characteristics of gust fronts?”

The methodology is to:

  1. Collect a set of polarimetric radar data that have gust fronts
  2. Draw polygons around the gust fronts using wg
  3. Use w2polygon2csv to extract all the pixel data within the polygons
  4. Write an R program to do linear regression of the parameters to identify gust fronts
  5. Write a program to draw histograms

Step 1:  Data collection and ingest

I downloaded the necessary Level-II data from NCDC’s HAS website, untarred it, and put it all in a directory named “raw”.  The directory listing looks like this.

[Image: directory listing]
Note that I have put all the files (even those from different radars) into the same directory.

While you could ingest them all separately into separate directories for each of the radars, it is more convenient (scripting-wise) to fool downstream algorithms by pretending that all the data comes from the same radar (I’ll assume KOUN).  This is done by the following call to ldm2netcdf:

rm -rf KOUN/netcdf; ldm2netcdf -i `pwd`/raw -o `pwd`/KOUN/netcdf -s KOUN -a -1 -p gz --verbose; makeIndex.pl `pwd`/KOUN/netcdf code_index.xml

After this, you will have a directory called KOUN/netcdf within which will be netcdf files corresponding to Reflectivity, Zdr, etc.

Step 2: Drawing polygons

Start up wg and select Controls | Readout.  Change the output directory in the unmarked textbox (the default is probably /tmp) to the location where you want the polygon files saved:

[Image: readout setup]
Bring up your data product (in this case, Reflectivity). Click on Add and then click around the gustfront to draw a polygon around the gustfront.  If you have multiple gust fronts, use the Done button to finish one polygon and start another. When you are finally done, click Save:

[Image: polygon drawn around a gust front]
At this point, a polygon XML file is written to the output directory. It will have a name like GustFront_Reflectivity_20120709-003408.xml where GustFront is the name of the data source in wg, Reflectivity the name of the product and the rest of the filename is the time stamp.

Move to the next image, click Reset to clear the existing polygon from the display and draw the set of polygons corresponding to the new time step.  Repeat for all your data cases (you can skip time steps if you want).

Step 3: Extract pixel data using w2polygon2csv

It is helpful to extract data both from within the polygons (“gustfronts”) and from outside the polygons (“not gustfronts”).  The raw data that I want to collect are Reflectivity, AliasedVelocity, RhoHV and Zdr.  I can also collect any other data as long as I have an algorithm that will produce that data in polar format.

Run w2polygon2csv as follows:

#!/bin/sh

makeIndex.pl `pwd`/KOUN/netcdf code_index.xml

rm -rf csv; mkdir -p csv/gf csv/ngf
INPUTS="Reflectivity:00.50 AliasedVelocity:00.50 RhoHV:00.50 Zdr:00.50"
NUMINPUTS=`echo $INPUTS | wc -w`
w2polygon2csv -i `pwd`/KOUN/netcdf/code_index.xml -I "$INPUTS" -o `pwd`/csv/gf -p `pwd`/polygons -F 60 -N $NUMINPUTS --verbose
w2polygon2csv -i `pwd`/KOUN/netcdf/code_index.xml -I "$INPUTS" -o `pwd`/csv/ngf -p `pwd`/polygons -F 60 -N $NUMINPUTS -V --verbose

After the above commands are run, we will have a set of comma-separated-values (CSV) files in csv/gf corresponding to gustfronts and another set of CSV files corresponding to not-gustfronts in csv/ngf.

 

Step 4:  Create a Linear Regression Model to identify gustfronts

If you are using R, you should combine the two sets of files, making sure to add a new column with the value 1 for gustfront pixels and 0 for not-gustfront pixels.  Like this:

head -q csv/gf/* | head -1 | awk '{print $0",GustFront"}' > csv/train.csv
cat csv/gf/*.csv | grep -v Time | awk '{print $0",1"}'  >> csv/train.csv
cat csv/ngf/*.csv | grep -v Time | awk '{print $0",0"}' >> csv/train.csv

The very first line (head) simply carries over the column information from the original CSV files.  The reason for the “grep -v” on the second and third lines is to remove this header line from all subsequent output. At this point, you will have a “train.csv” with all the pixel data, and you can use R’s read.csv to read and process it.

The R program to do the linear regression:

data <- read.csv("csv/train.csv")
data <- data[ data$Reflectivity < 30, ]
data <- data[ data$RhoHV < 0.9, ]

numgustfronts <- length( data[data$GustFront > 0.5,1] )
numnegatives  <- length( data[data$GustFront < 0.5,1] )
weights <- data$RhoHV
weights[ data$GustFront > 0.5 ] <- 0.5/numgustfronts
weights[ data$GustFront < 0.5 ] <- 0.5/numnegatives

model2 <- lm( GustFront ~ abs(AliasedVelocity - 6) + ChangeInReflectivity + abs(Kdp_approx) + abs(Reflectivity - 13) + RhoHV + abs(Zdr), data=data, weights=weights )

You may notice that I have additional columns in my output — I used existing WDSS-II tools (beyond the scope of this blog) to produce a ChangeInReflectivity (over time) and to approximate the Kdp.

Step 5: Draw histograms using Python

You could use R itself to draw histograms, but these days, I prefer using Python, and so I will read the individual files from within the Python program itself. My Python program, for simplicity, will read and plot only the gustfront csv files (repeat for the non-gustfront data).

The first part of the Python program reads in the data:

#!/usr/bin/python
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import os

(VEL,DELREF,FZAGG,KDP,REF,RHOHV,ZDR,TIME,LAT,LON,NUMDATACOLS)=range(0,11)

PREFIX="gf"; TITLE="gustfronts"
#PREFIX="ngf"; TITLE="non-gustfronts"

indir="/scratch2/lakshman/gustfront/csv/" + PREFIX

# read all the files in the above directory ...
files = os.listdir(indir)
alldata = np.empty([0,NUMDATACOLS])
for file in files:
  try:
     # read file, but convert last-but-two element (which is time) to zero
     data = np.loadtxt(indir + "/" + file, comments="#", delimiter=",", unpack=False, skiprows=1, converters={(NUMDATACOLS-3): lambda s: 0} )
     # remove any points that do not have a valid value
     sel = data[:,0] > -999
     for col in xrange(1,NUMDATACOLS):
        sel = np.logical_and(sel, data[:,col] > -999)
     sel = np.logical_and(sel, data[:,RHOHV] < 0.99 )  # rhohv valid value < 1
     data = data[sel]
     print np.shape(data)
     alldata = np.row_stack( (alldata, data) )
  except:
     print "Ignoring " + file

As I read in the data, I append it to an array called “alldata”, so at the end of the loop, my “alldata” contains all the valid (non-missing) data in the CSV files.

The next step is to define two Python functions, one to compute the nth percentile of the data and the other to plot a nice-looking histogram:

def percentile(a, P):
  # sort a
  a = np.sort(a)
  n = int( round(P*len(a) + 0.5) )
  if n > 1:
     return a[n-2]
  else:
     return a[0]

def plothist(a, title):
  global nrows, ncols, plotno
  axes = plt.subplot(nrows,ncols, plotno)
  weights = np.ones_like(a) / len(a)
  axes.hist(a, numbins, facecolor='green', normed=False, alpha=0.5, weights=weights)
  p = percentile( np.abs(a), 0.1 )
  plt.plot( [p,p], axes.get_ylim(), 'b--' )
  pos = axes.get_ylim()[0] * 0.25 + axes.get_ylim()[1] * 0.75
  axes.annotate( np.around(p,2),
                 xy=(p, pos), xycoords='data',
                 xytext=(50,30), textcoords='offset points',
                 arrowprops=dict(arrowstyle="->",
                      connectionstyle="angle3,angleA=90,angleB=0") )
  p = percentile( np.abs(a), 0.5 )
  plt.plot( [p,p], axes.get_ylim(), 'b--' )
  pos = axes.get_ylim()[0] * 0.75 + axes.get_ylim()[1] * 0.25
  axes.annotate( np.around(p,2),
                 xy=(p, pos), xycoords='data',
                 xytext=(50,30), textcoords='offset points',
                 arrowprops=dict(arrowstyle="->",
                      connectionstyle="angle3,angleA=90,angleB=0") )
  p = percentile( np.abs(a), 0.9 )
  plt.plot( [p,p], axes.get_ylim(), 'b--' )
  axes.annotate( np.around(p,2),
                 xy=(p, np.mean(axes.get_ylim())), xycoords='data',
                 xytext=(-50,30), textcoords='offset points',
                 arrowprops=dict(arrowstyle="->",
                      connectionstyle="angle3,angleA=90,angleB=0") )
  axes.set_ylabel("Frequency")
  axes.set_xlabel(title)
  #plt.title("Distribution of " + title)
  plotno = plotno + 1

Then, simply call the histogram function once per variable:

# process the data
data = alldata
print "Total: ",np.shape(data)

plotno=1
nrows=3
ncols=3
numbins=20

plt.figure(figsize=(15,12))

plothist( np.abs(data[:,VEL]), '|Velocity| (m/s)')
plothist( data[:,REF], 'Reflectivity (dBZ)' )
plothist( data[:,RHOHV], 'RhoHV' )
plothist( np.abs(data[:,ZDR]), '|Zdr| (dB)')
plothist( np.abs(data[:,KDP]), '|Kdp| (deg)')
plothist( data[:,DELREF], 'Change in Reflectivity (dBZ)' )
plothist( data[:,FZAGG], 'Fuzzy Aggregate')

fig = plt.gcf()
fig.suptitle("Polarimetric characteristics of " + TITLE, fontsize=24)
#plt.show()
plt.savefig( PREFIX + 'stats.png') #, bbox_inches='tight')

The final output looks like this:

[Image: gfstats.png, the panel of histograms produced by the script]

 

Tags: None

Coordinates in WDSS-II

There seems to be some confusion about the geolocation of WDSS-II’s grids.  First of all, most of the grids produced by WDSS-II algorithms are in Plate Carree (or equirectangular) projection for the reasons ably set forth in this cartoon.

Suppose you were to ask w2merger to make you a grid from radar data and you specify the top (northwest) corner with -t and bottom (southeast) corner with -b and spacing with -s as (35,-97), (34.97, -96.97) and (0.01,0.01) respectively.  You would then get this 3×3 grid:

[Image] The grid you get if you ask w2merger for -t "35 -97" -b "34.97 -96.97"

I have found that if you consider that all pixels occupy a definite area of the earth, the above representation becomes very logical. It is also intuitive in that there are 3 pixels between 35 degrees and 34.97 degrees at a spacing of 0.01 degrees.

In the netcdf files output by WDSS-II, you will find that the northwest corner and the grid spacing for the above grid would be encoded as (35,-97) and (0.01,0.01).

So, are the pixels in WDSS-II defined by their north-west corners? Unfortunately, no. For that, you have to take into account that while a pixel occupies a certain area, it has only one value. Which location within the pixel does that value correspond to? The value of a bin is the average value within the region covered by that bin.

The answer to the second question leads to some tricky semantics. Before we get to those, let’s move on from the world geographic system to the projected coordinate system of the grid itself (see ArcGIS for an explanation of the difference).  Because the projection in question is Plate Carree, the transformation between pixel coordinates and latitude-longitude is a simple linear one; the point is that such a transformation exists. For this coordinate system, the (0,0) point is the center of the northwest grid point.  This is needed so that we can think of a pixel’s value as being the average value within the bin if we somehow had infinite resolution. The grid’s coordinate system, to put a picture to it, is like this:

[Image] Projected coordinate system

A couple of things may warrant noting. The first coordinate (the slower-changing one) is the latitude direction and the second coordinate (the faster changing one) is the longitude one. In other words, grid values are written starting from the northwest corner in rows.  Confusingly, the first coordinate (the “vertical” one if you are staring at an image) is called the x-axis in the sparse-grid netcdf format (“pixel_x”) and is the coordinate we ask for first on all command-lines that ask for a position or length.  [Side note: This is because my background is in image processing and linear algebra where this right-handed coordinate system is common. By the time I figured out that meteorologists and computer graphics used “x” for the “horizontal” dimension, it was too late and there was too much code written with the matrix notation firmly in place.]

The two definitions above are very intuitive, and if you don’t think much about it, you will probably end up doing the right thing. But just to make sure you are thinking about this the right way, try to answer this question.  Given the grid above, what is the value at the location denoted by a star in this diagram?

[Image] To get the value at the location denoted by the star, you would interpolate between the values at the 4 pixels closest to the star. The weight of each value would depend on the distance between the star and the center of the corresponding pixel.

To find the nearest neighbor to a point (lat,lon), you would start by computing floor( (nwlat-lat)/deltalat ) to get the first pixel coordinate and floor( (lon-nwlon)/deltalon ) to get the second pixel coordinate.  If you wished to interpolate, you would also compute the ceil() besides the floor(), so that you get the four pixels in question. Then, you would compute the distances of the star from the pixel centers to get distance weights and then compute a weighted average of the four values.
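Here is a minimal Python sketch of both lookups, assuming the grid is a 2D numpy array whose (0,0) element is centered on the northwest corner (the function and variable names are mine, not WDSS-II’s; the interpolation uses bilinear weights, the standard separable form of distance weighting):

import numpy as np

def nearest_neighbor(grid, nwlat, nwlon, deltalat, deltalon, lat, lon):
    # first pixel coordinate increases southward, second increases eastward
    i = int(np.floor((nwlat - lat) / deltalat))
    j = int(np.floor((lon - nwlon) / deltalon))
    return grid[i, j]

def interpolate(grid, nwlat, nwlon, deltalat, deltalon, lat, lon):
    # fractional pixel coordinates of the query point
    y = (nwlat - lat) / deltalat
    x = (lon - nwlon) / deltalon
    i0, j0 = int(np.floor(y)), int(np.floor(x))
    i1, j1 = int(np.ceil(y)), int(np.ceil(x))
    fy, fx = y - i0, x - j0
    # weighted average of the four surrounding pixel centers
    return ((1 - fy) * (1 - fx) * grid[i0, j0] +
            (1 - fy) * fx * grid[i0, j1] +
            fy * (1 - fx) * grid[i1, j0] +
            fy * fx * grid[i1, j1])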

 

Tags: None

Radar coverage

What fraction of the US population is within x kilometers of a weather radar? We can answer this question for the lower 48 states from the w2merger caches, a grid of US population density, and digital elevation (terrain) data.  The WDSS-II program is called coverageStats, and I ran it like this:

coverageStats -i $HOME/.w2mergercache/_55.000_-130.000_500.000___NMQWD_0.010_0.010___33_3500_7000/ -E conus/conusterrain.nc -P conus/nap10ag.asc.gz -o `pwd`/coverage -h 0:5 --verbose

I am defining a place as covered by a weather radar if it is scanned by that radar at a height below 5 km above ground level.  The population density data are in Esri Grid format from Columbia University. The digital elevation data are from the USGS (the gtopo30 dataset) and have been converted to netcdf using the WDSS-II tool topoBreak.  Radar coverage information comes from the MRMS CONUS 1km resolution cache (created using createCache).

Here’s what the result looks like:

[Image: pop_mindist_frac]
70% of the US population is covered by a weather radar that is less than 100 km away.  A little less than 20% of the US population is not covered, or is covered only by a radar beam that is at a height of more than 5 km.  Note that these numbers take beam blockage into account.

Here’s a map of what area is covered at what height.

[Image: DistanceToNearestRadar_20130120-120000]
One thing to realize is that, because I started with the MRMS cache, parts of Canada are included in these statistics.

If you want to try out different assumptions (What if I drop radar X? Do not consider Department of Defense radars? Use height of 3km to 5km? etc.), feel free to run coverageStats yourself.

 

Tags: None

How to get the -t and -b given a radar name

Lots of WDSS-II algorithms (w2merger, w2pngconvert, etc.) ask for a top-left and bottom-right corner.  Perhaps you are processing single radar data in a script and want to automatically determine what to specify for -t and -b …

Here’s a sequence of UNIX commands that you can use to obtain the -t and -b given a radar name:

TOP=`grep -A 2 $RADAR ~/WDSS2/w2config/misc/radarinfo.xml | head -2 | tail -1 | sed 's/[=\"]/ /g' | awk '{print $3+4,$5-4}'`
BOT=`grep -A 2 $RADAR ~/WDSS2/w2config/misc/radarinfo.xml | head -2 | tail -1 | sed 's/[=\"]/ /g' | awk '{print $3-4,$5+4}'`

Try it with your favorite radar.  I have it in bash syntax, but obviously, you can put this in pretty much any script.

The +4 and -4 in the awk expressions indicate that we are going 4 decimal degrees (approximately 400 km) from the radar center. Obviously, you can/should change that depending on your needs.
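For instance, the corners can be fed straight into another tool (a sketch; KTLX, the vertical-level arguments "20" and "1", and the spacing are placeholders chosen to match the w2merger examples elsewhere on this blog):

RADAR=KTLX    # your favorite radar
TOP=`grep -A 2 $RADAR ~/WDSS2/w2config/misc/radarinfo.xml | head -2 | tail -1 | sed 's/[=\"]/ /g' | awk '{print $3+4,$5-4}'`
BOT=`grep -A 2 $RADAR ~/WDSS2/w2config/misc/radarinfo.xml | head -2 | tail -1 | sed 's/[=\"]/ /g' | awk '{print $3-4,$5+4}'`
w2merger -i /path/to/code_index.xml -o /path/to/output -I ReflectivityQC:00.50 -t "$TOP 20" -b "$BOT 1" -s "0.01 0.01 1" --verbose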