Map Rooms
Map Rooms are web-accessible tools targeted at particular user groups, the end result of a process that evaluates user-group needs and builds tools that help address those needs. These tools preselect data and analyses suitable for the task, building an easy-to-use framework for addressing the users' immediate needs, and provide links that let users quickly download the data into their group's standard tools for further analysis. While there are now many map rooms and hundreds of map room pages, several stand out for their operational use by their user groups.
The Malaria Early Warning System (MEWS) Map Room (Figures 1a,b; Grover-Kopec et al. 2005) exists because, where malaria is not adequately controlled, its distribution and seasonality are driven by climate factors such as temperature, humidity and rainfall. Knowing when conditions are suitable for malaria transmission gives health officials several weeks, sometimes months, of warning to apply insecticides, stockpile medicines and alert hospitals. The MEWS maps illustrate models of climate suitability for seasonal endemic malaria, and recent climate conditions, such as rainfall anomalies, which may be associated with epidemic malaria in warm semi-arid regions of Africa. It is used by national malaria-control program personnel in Africa. Data used include CPC/Famine Early Warning System Dekadal Estimates, NASA MODIS vegetation, and analyses based on NOAA NCEP/NCAR CDAS-1 Reanalysis and CPC Merged Analysis of Precipitation.
The Desert Locusts Map Room (Figures 2a,b; Ceccato et al. 2006, 2007a) exists because swarms of desert locusts can travel thousands of miles and can threaten the food security and livelihoods of up to one fifth of the world’s population. Recent plagues caused an estimated $400 million in damages and affected 8.4 million people. Knowing when and where environmental conditions are right for these insects to multiply helps authorities control their numbers. The map room shows maps and analysis products illustrating recent climate conditions, such as rainfall and vegetation growth, which provide ideal breeding conditions for the locusts. It is used by the U.N. FAO and regional locust-control workers. Data used include NOAA CPC CMORPH precipitation and NASA MODIS vegetation.
The Indonesian Fire Map Room (Figure 3; Someshwar et al. 2010) is based on research on peatland fires in the Indonesian province of Central Kalimantan that has uncovered a close correlation between satellite rainfall data and fire hotspot activity. In particular, rainfall during the dry season from June to October is critical in determining fire incidence. This finding means such data can help indicate whether an upcoming fire season will be more or less intense than usual, and can help authorities take preventive measures to avoid impacts on biodiversity, public health and global greenhouse gas emissions. The fire map room shows ten-day precipitation estimates for Indonesia, and graphs that show the relationship between the number of fires and the NINO4 index in the previous month for the four Kalimantan provinces. It is available in English and Bahasa Indonesia. It is used by provincial environment, forestry and meteorological agencies. Data used include NOAA CPC CMORPH.
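As a rough illustration of the rainfall and fire relationship described above, the following Python sketch computes a Pearson correlation between dry-season rainfall totals and hotspot counts. The numbers are invented for illustration and are not taken from Someshwar et al. (2010); the function and variable names are assumptions as well.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Invented values: June-October rainfall totals (mm) and hotspot counts
# for the same six hypothetical years.
dry_season_rain_mm = [310.0, 120.0, 450.0, 200.0, 90.0, 380.0]
hotspot_count = [400, 2100, 150, 900, 2600, 300]

r = pearson_r(dry_season_rain_mm, hotspot_count)
print(f"rainfall vs. hotspots: r = {r:.2f}")
```

With data of this shape the coefficient is negative: drier dry seasons go with more hotspots.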
The IFRC Map Room (Figure 4) addresses a problem that humanitarian organizations face when responding to cyclones, floods and other weather-related disasters: deciding when and where to send aid. Determining which areas are likely to be hit first or hardest by an event can mean the difference between life and death. Also critical is the prediction of disaster “hotspots”, or areas at high risk because of their location and the vulnerability of their populations (e.g., a densely populated flood plain). The map room shows the relative severity of forecast rainfall events, 1–6 days in advance; “predictions in context” maps showing where seasonal forecasts indicate an enhanced chance of continuation or reversal of previously observed rainfall; and population and poverty maps. It is used by the International Federation of Red Cross and Red Crescent Societies’ operations-support department. Data used include the NOAA ESRL PSD Reforecast, NOAA CPC Merged Analysis of Precipitation, IRI Seasonal Forecasts, and CIESIN gridded population.
Other map rooms permit access to sophisticated analyses. The Time Scales Map Room (Figure 5; Greene et al. 2011) presents a decomposition by time scale of twentieth-century precipitation and temperature variations. It sheds light on the characteristics of historical temperature and precipitation variability, and in the process clarifies the potential utility of different types of climate information. Anticipated climate-related risks will tend to vary as well, with slower variations modulating the likelihood of adverse or beneficial events that play out on shorter timescales. Three scales are defined, corresponding to i) secular variation due to anthropogenic influence, ii) an interannual component of natural variability, and iii) a decadal component of natural variability. Natural variability is variability intrinsic to the climate system; its interannual and decadal components are separated at a cutoff period of 10 years. Consequently the variability due to the El Niño-Southern Oscillation is classified as interannual, while variability on timescales of 10 years or longer is classified as decadal. The user may define a season of interest, and results are displayed as a map of variance explained or standard deviation, and as time series at a given location. Data processing consists of linear regression, to extract slow, trend-like changes, and low-pass filtering (Butterworth), to separate high- and low-frequency components in the detrended data. Another version is available in the IFRC Map Room, where the statistical values are categorized by the degree of importance of variability, providing less technical, tailored information for planning on different timescales. The secular variation due to anthropogenic influence is represented by the CMIP3 multi-model ensemble mean, and the scale decomposition is applied to monthly mean precipitation and temperature from CRU TS3.1.
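The two processing steps named above (regression to extract the trend, then low-pass filtering of the detrended data) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the map room's implementation: the series is regressed on a supplied forced signal (standing in for a multi-model mean), and the Butterworth amplitude response is applied in the frequency domain with a naive DFT rather than as a recursive time-domain filter. The filter order of 5, the synthetic data, and all function names are assumptions.

```python
import cmath
import math


def regress(y, x):
    """Least-squares fit y ~ a + b*x; returns (fitted values, residuals)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    return fitted, [yi - fi for yi, fi in zip(y, fitted)]


def butterworth_lowpass(series, cutoff_period, order=5):
    """Zero-phase low-pass: scale each DFT coefficient by the Butterworth
    amplitude response 1/sqrt(1 + (f/fc)^(2*order)). O(n^2), small n only."""
    n = len(series)
    fc = 1.0 / cutoff_period  # cutoff frequency, cycles per time step
    coeffs = []
    for k in range(n):
        c = sum(series[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        f = min(k, n - k) / n  # frequency of bin k, folded to [0, 0.5]
        coeffs.append(c / math.sqrt(1.0 + (f / fc) ** (2 * order)))
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


def decompose(series, forced_signal, cutoff_years=10.0):
    """Split an annual series into trend, decadal and interannual parts."""
    trend, detrended = regress(series, forced_signal)
    decadal = butterworth_lowpass(detrended, cutoff_years)
    interannual = [d - low for d, low in zip(detrended, decadal)]
    return trend, decadal, interannual


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Synthetic 100-year series: forced ramp + 50-year cycle + 4-year cycle.
years = range(100)
forced = [0.01 * t for t in years]  # hypothetical stand-in for a forced signal
series = [2.0 * f + math.cos(2 * math.pi * t / 50) + 0.5 * math.sin(2 * math.pi * t / 4)
          for t, f in zip(years, forced)]

trend, decadal, interannual = decompose(series, forced)
total = variance(series)
# Shares of variance; the parts are not exactly orthogonal, so these
# need not sum exactly to 1.
shares = {name: variance(part) / total
          for name, part in [("trend", trend), ("decadal", decadal),
                             ("interannual", interannual)]}
```

By construction the three components add back up to the original series, and the 10-year cutoff places the 4-year cycle in the interannual component and the 50-year cycle in the decadal one.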
The Flexible Forecast Map Room (Figure 6) consists of probabilistic temperature or precipitation seasonal forecasts based on the full estimate of the probability distribution, an extension of the more traditional tercile forecast. Statistical calibration of multi-model ensembles, based on the historical performance of those models, yields probabilistic seasonal forecasts that provide reliable information to a wide range of climate-risk and decision-making communities, as well as to the forecast community. The flexibility of the full probability distributions allows delivery of interactive maps and point-wise distributions tailored to user-determined needs, since the probability of exceeding a user-defined historical percentile is actionable. It allows users to tailor the forecast to real-world problems that may vary from malaria control planning to disaster risk management to hydropower management, to name just a few. It uses historical observations of monthly temperature from CAMS and precipitation from CMAP combined with IRI forecast data.
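The exceedance calculation described above can be sketched as follows. This is only an illustration: it assumes the forecast distribution at a location is Gaussian with a given mean and standard deviation, which is a simplification (precipitation in particular is often skewed), and the data values and function names are invented.

```python
from statistics import NormalDist


def historical_percentile(values, pct):
    """Value at percentile pct (0-100) of the record, by linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def prob_exceed(forecast_mean, forecast_sd, threshold):
    """P(X > threshold) for a Gaussian forecast distribution (an assumption)."""
    return 1.0 - NormalDist(forecast_mean, forecast_sd).cdf(threshold)


# Invented seasonal rainfall totals (mm) for one location.
history_mm = [210.0, 340.0, 280.0, 150.0, 400.0, 310.0, 260.0, 190.0, 370.0, 230.0]

# The user picks the percentile that matters for their decision.
threshold_mm = historical_percentile(history_mm, 80)

# Invented forecast parameters for the coming season.
p = prob_exceed(forecast_mean=330.0, forecast_sd=60.0, threshold=threshold_mm)
print(f"P(rainfall > {threshold_mm:.0f} mm) = {p:.2f}")
```

A tercile forecast fixes the thresholds at the 33rd and 67th percentiles; here the threshold is an input, which is the flexibility the paragraph above refers to.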
The Drought Map Room (Figure 7) was developed with funding from the NOAA Climate Test Bed and in collaboration with partners from the NOAA Earth System Research Laboratory and Climate Prediction Center (CPC) to produce quantifiable, probabilistic forecasts of drought over the U.S. and Mexico a few months in advance, using the standardized precipitation index (SPI; McKee et al. 1993) as an indicator of precipitation deficits. The map room includes analyses of past and current drought using the CPC Unified and U.S. Climate Divisions precipitation datasets as observational inputs at SPI accumulation periods of 3, 6, 9, and 12 months. The user can view maps of the SPI analysis at each of the accumulation periods and click on the map to view time series of the SPI at the selected location over recent years, with the D0-D4 drought severity thresholds from the North American Drought Monitor indicated. Two methodologies are used to produce probabilistic drought forecast maps. The first forecast map tool uses an “optimal persistence” method (Lyon et al. 2012) based upon the correlation between SPI calculated for recently observed precipitation and SPI calculated for a future month using recently observed and historically observed mean precipitation. The second, the SPI Multi-Model Ensemble Forecast Tool, builds upon the Persistence Tool, but uses forecast SPI values based upon precipitation from the IRI Multi-Model Ensemble (MME) at locations, starting months, and leads where hindcast correlation skill from the IRI MME improves over the Persistence method. Both tools display maps of the probability of SPI falling below a user-selected threshold and the forecast SPI in a future month for a user-selected marginal probability.
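To make the SPI idea concrete, here is a simplified Python sketch of accumulating precipitation and standardizing it. The standard SPI of McKee et al. (1993) fits a parametric distribution (typically a gamma) to the accumulated totals; this sketch swaps in an empirical, rank-based CDF to stay short, so its values will differ from a gamma-based SPI. The data and function names are invented.

```python
from statistics import NormalDist


def rolling_sum(monthly, window):
    """Accumulated totals over `window` consecutive months."""
    return [sum(monthly[i - window + 1:i + 1])
            for i in range(window - 1, len(monthly))]


def empirical_spi(totals):
    """Rank-based standardized index for totals of the same season in
    different years: rank -> probability rank/(n+1) -> standard normal
    quantile. Ties are broken by order of appearance."""
    n = len(totals)
    order = sorted(range(n), key=lambda i: totals[i])
    spi = [0.0] * n
    for rank, i in enumerate(order, start=1):
        spi[i] = NormalDist().inv_cdf(rank / (n + 1))
    return spi


# Accumulation step on a short invented monthly series (mm).
three_month = rolling_sum([30.0, 45.0, 10.0, 5.0, 60.0], window=3)

# Invented 3-month totals (mm) for the same season in eight different years.
season_totals = [180.0, 95.0, 240.0, 130.0, 60.0, 210.0, 155.0, 110.0]
spi = empirical_spi(season_totals)

driest_year = spi.index(min(spi))
print(f"driest year index: {driest_year}, SPI = {spi[driest_year]:.2f}")
```

Negative values indicate drier-than-usual conditions for that season, which is what the drought severity thresholds are applied to.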
Analysis tools
The IRI Data Library is a framework that allows easy application of analysis filters to a wide variety of data. There are several important factors that distinguish it as an analysis framework.
1. Data are organized into datasets composed of sub-datasets and multi-dimensional variables with use metadata. These variables can be quite large (terabytes) with many dimensions, so that a single variable can conceptually unify what in practice may be many files spread across many directories, details the user can ignore (or be blissfully unaware of).
2. Analysis filters usually return variables (sometimes datasets), i.e. data with associated use metadata. This means filters can be chained together: any analysis result behaves as if it were a named dataset. In fact, a number of variables named in the dataset collection are analyses based on other datasets.
3. Specifying a calculation is separate from actually executing it, so that chains of calculations processing large amounts of data can be specified and manipulated while the actual execution of the data flow (or portions thereof) is delayed until it is actually required. This allows one to think in the abstract about manipulating the entire dataset, yet actually access it one portion at a time. It also shifts the responsibility for efficiently arranging the calculation away from the user, who can then focus on the actual scientific and statistical analysis.
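The second and third points (filters that return variables with metadata, and specification kept separate from execution) can be illustrated with a small Python sketch. This is not the Data Library's implementation or syntax; the `Variable` class, its methods, and the sample values are hypothetical, chosen only to show the pattern.

```python
from itertools import islice


class Variable:
    """A deferred calculation with metadata: it records what to compute
    and only reads or computes values when someone iterates over it."""

    def __init__(self, source, metadata, steps=()):
        self._source = source          # zero-argument callable yielding values
        self.metadata = dict(metadata)
        self._steps = tuple(steps)     # element-wise filters, applied in order

    def apply(self, func, **metadata_updates):
        """Return a new Variable with one more filter; nothing is computed."""
        return Variable(self._source,
                        {**self.metadata, **metadata_updates},
                        self._steps + (func,))

    def __iter__(self):
        for value in self._source():
            for step in self._steps:
                value = step(value)
            yield value


reads = []  # records which source elements were actually read


def read_kelvin():
    """Stand-in for reading a large dataset one portion at a time."""
    for i, value in enumerate([271.15, 273.15, 300.15, 310.15]):
        reads.append(i)
        yield value


temp_k = Variable(read_kelvin, {"units": "K"})
temp_c = temp_k.apply(lambda v: v - 273.15, units="degree_Celsius")
anomaly = temp_c.apply(lambda v: v - 20.0)   # the result is again a Variable

reads_before = list(reads)            # the chain is defined, nothing read yet
first_two = list(islice(anomaly, 2))  # evaluate only the portion requested
```

Defining the chain touches no data; asking for the first two values reads only the first two source elements, and the result carries the metadata accumulated along the chain.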
This easy access to analysis filters is particularly useful in training. The Climate Information for Public Health course (Cibrelus and Mantilla 2010), for example, is intended to engage decision makers directly, not just through expert lectures, but also through focused discussions and practical training sessions. These sessions introduce the participants to geographical information system (GIS)-based computational tools for analyzing epidemiological data together with climate, population and environmental data. To allow the students to focus on the course content and still be able to analyze their own data in the context of available climate information, we have built services that allow them to access and analyze their own data within the Climate Data Library, and have added analysis functions particularly useful in health analyses, ranging from k-means clustering to disease epidemic threshold calculations. This course and its tools have been taught in an annual Summer Institute and in sessions around the world, some using the Standalone Data Library (see below).
Another example of advanced analysis using the IRI Data Library is the creation of spatio-temporal maps of malaria incidence from health surveillance data. Monthly data on clinical malaria cases from 242 health facilities in 58 subzobas (district boundaries from the National Statistics and Evaluation Office) in Eritrea from 1996 to 2003 were used in a novel stratification process to guide future interventions and the development of an epidemic early warning system. The process used principal component analysis and nonhierarchical clustering to define five areas with distinct malaria intensity and seasonality patterns, and has been used by the Eritrean Malaria Control program in its planning process (Ceccato et al. 2007b).
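As a rough sketch of the clustering step in such a stratification, the following Python example groups monthly incidence profiles with k-means, one common nonhierarchical method. It omits the principal component analysis step, uses a deterministic start from the first k profiles, and runs on invented profiles with two clusters; it is not the procedure of Ceccato et al. (2007b), and all names are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain k-means; starts from the first k points so the result is
    reproducible. Returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Invented mean monthly case counts (January to December) for four facilities.
profiles = [
    [2, 2, 3, 5, 8, 12, 20, 35, 50, 30, 10, 4],   # high, peaks in September
    [1, 1, 1, 2, 2, 1, 1, 2, 2, 1, 1, 1],         # low, no clear season
    [3, 2, 4, 6, 9, 14, 22, 30, 45, 28, 12, 5],   # high, peaks in September
    [0, 1, 1, 1, 2, 2, 1, 1, 2, 2, 1, 0],         # low, no clear season
]

labels, centroids = kmeans(profiles, k=2)
print(labels)
```

Facilities with similar intensity and seasonality end up with the same label, and each centroid is the mean seasonal profile of its group.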
Semantic Technology
Often in a research community there are several different metadata standards used to describe the same object. Associated with each metadata standard is a conceptual model, frequently not explicit, which describes the object in its own way. We are using an RDF/XML (Resource Description Framework) framework to address this issue and create a flexible, reusable solution that can adapt to a variety of new metadata standards. It implements a semantic framework for explicitly writing down multiple metadata schemas and conceptual models as ontologies; the ontologies identify metadata elements and concepts and characterize the relationships between them. We also use the framework to write crosswalks, i.e. explicit characterizations of the relationships between concepts and metadata elements belonging to different systems, including the connections between the metadata objects and the concepts they represent. Not only does this framework allow translation between alternate systems, it also facilitates building a more complete description of data objects out of a number of narrowly focused standard systems. Going beyond standards, it can explicitly describe the data models implicit in programs that display and manipulate data. Writing models, crosswalks, and objects all in RDF/Semantic Web form means that these data models and metadata standards can be combined into a single framework, leading to an interoperable metadata standard (Blumenthal et al. 2011).
Crosswalking between different standards can be as simple as two different names for the same quantity, but sooner or later the mapping gets more complicated. Frequently, different objects are related conceptually but are very different structurally. Our framework thus has both structure and conceptual models. Structure models describe how dataset metadata is written, e.g. cfatt, which describes the attributes of a Climate and Forecast Metadata (CF) Convention netCDF file. Conceptual models describe the conceptual objects represented in the convention, e.g. cf-obj, which describes the more abstract objects (like geo-located data) that are being described in the CF convention, objects that are also described in other systems, but are not explicitly written in any given CF netCDF file. XML Schema is a common way to represent structure models for XML files, and we have a translation of XML Schema to RDF/OWL which allows us to create conforming XML files from RDF information. We have applied this to the WCS Schema, for example, to extract the needed information for an OPeNDAP WCS service based on RDF extracted from CF/netCDF files. We have also included controlled vocabularies such as CF standard names or GCMD scientific parameters. Controlled vocabularies are a common way to structure classifications, and are important for building a faceted search that works across diverse datasets.
The framework is established by creating ontologies for each metadata representation of these objects, and rule-based crosswalks between them, so that each object is expressed in all representations; thus all objects can be viewed in multiple systems. This technology has been encapsulated in a Java-based persistence/inferencing framework for OPeNDAP (Cornillon et al. 2009) as part of a NOAA/IOOS project (Holloway et al. 2010). This work combines custom innovations, the use of ontologies, and leading Semantic Web technologies such as Sesame and OWLIM. Because this framework was developed on Java technology, the system is highly portable between various platforms.
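The idea of rule-based crosswalks can be illustrated with a toy forward-chaining example in Python. It stores statements as (subject, predicate, object) tuples and applies rules until nothing new can be inferred. This is only a sketch of the concept, not the Java framework described above and not real RDF tooling; the prefixes echo the cfatt and cf-obj models mentioned in the text, but the specific property names, the rule format, and the vocabulary mapping are hypothetical.

```python
def infer(triples, rules):
    """Apply crosswalk rules until no new triples appear (a fixpoint).

    Each rule maps (predicate, object-or-None) to (new predicate,
    new object-or-None). None on the left matches any object; None on
    the right copies the matched object unchanged."""
    known = set(triples)
    changed = True
    while changed:
        changed = False
        for subj, pred, obj in list(known):
            for (match_p, match_o), (new_p, new_o) in rules.items():
                if pred == match_p and match_o in (None, obj):
                    new_triple = (subj, new_p, obj if new_o is None else new_o)
                    if new_triple not in known:
                        known.add(new_triple)
                        changed = True
    return known


# Structure-level statements, as they might be read from a file's attributes.
triples = {
    ("ex:sst", "cfatt:standard_name", "sea_surface_temperature"),
    ("ex:sst", "cfatt:units", "K"),
}

# Hypothetical crosswalk rules.
rules = {
    # Simple case: a different name for the same thing.
    ("cfatt:standard_name", None): ("cfobj:quantity", None),
    # Vocabulary mapping: one controlled term to a term in another system.
    ("cfobj:quantity", "sea_surface_temperature"):
        ("gcmd:parameter", "SEA SURFACE TEMPERATURE"),
}

expanded = infer(triples, rules)
for triple in sorted(expanded):
    print(triple)
```

The second rule only fires on a triple produced by the first, which is why the rules are applied repeatedly; after inference the same object is described in all three vocabularies.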
We also developed a Java-based XML element extraction system, which allows information to be extracted from the framework into an XML format based on data description and delivery standards (WMS, SERF, etc.). With these tools we can further develop technologies for delivering climate data and analysis to partner systems.