Katrina_Assignment_1

==== The data for this assignment are from the HIAPER Pole-to-Pole Observations (HIPPO) mission flown aboard the NCAR Gulfstream V during March/April 2010. Presented below is the flight path covered during the 3-week mission. The data used for this project were the 1 Hz samples of CO, CO2, CH4, N2O, O3 and many other compounds. Any times with missing data were removed for simplicity. Additionally, the flights to and from Boulder, CO were not used. ====



Part 1: Create 1D Histogram and the Associated PDF

 * 1) Choose 1 variable that is interesting to you, and create a histogram of that variable.
 * ======What is the best number of bins, why? **The number of bins used was 50. This was not a calculated value but was what seemed to be the best representation of the data.**======
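
Since the 50 bins were chosen by eye, one possible cross-check is to compare that choice with common rule-of-thumb bin counts. The sketch below is in Python (not the Matlab/IDL tools named in the assignment) and uses Sturges' rule and the Freedman-Diaconis rule. The function names and the `sample` series are hypothetical stand-ins, not the HIPPO data.

```python
import math
import statistics


def sturges_bins(values):
    """Sturges' rule: ceil(log2(n)) + 1 bins."""
    return math.ceil(math.log2(len(values))) + 1


def freedman_diaconis_bins(values):
    """Freedman-Diaconis rule: bin width = 2 * IQR / n**(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    width = 2.0 * (q3 - q1) / len(values) ** (1.0 / 3.0)
    if width == 0:
        return 1  # degenerate case: all central values identical
    return max(1, math.ceil((max(values) - min(values)) / width))


# Synthetic, evenly spaced stand-in series (NOT HIPPO data).
sample = [385.0 + 0.01 * i for i in range(1000)]
n_sturges = sturges_bins(sample)
n_fd = freedman_diaconis_bins(sample)
```

These rules only give a starting point; a bin count picked by inspecting the histogram, as done here, can still be the better representation for skewed or multimodal data.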




 * 1) Using the same variable create a PDF (normalized).
 * 1) Compute the first 4 moments using a loop over the histogram bins.
|| || Loop Calculated || Automatic ||
|| Mean || 388.67 || 388.67 ||
|| Variance || 13.247 || 13.244 ||
|| Skew || 0 || 0.32207 ||
|| Kurtosis || 2.1051 || -0.89743 ||
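
As a rough illustration of the loop over histogram bins, the following Python sketch builds a histogram, normalizes it to a PDF, and accumulates the first four moments bin by bin. It is a minimal sketch under stated assumptions, not the code used for the table: the function names and `demo_values` are hypothetical, and bin centers stand in for the samples in each bin. The two kurtosis values in the table differ by almost exactly 3, which would be consistent with one column reporting excess kurtosis (kurtosis minus 3); which convention each tool used is not stated here, so that is an assumption.

```python
def histogram(values, nbins):
    """Count values in nbins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        k = min(int((v - lo) / width), nbins - 1)  # put max value in last bin
        counts[k] += 1
    centers = [lo + (k + 0.5) * width for k in range(nbins)]
    return centers, counts, width


def normalize(counts, width):
    """Turn counts into a PDF so that sum(pdf) * width equals 1."""
    n = sum(counts)
    return [c / (n * width) for c in counts]


def binned_moments(centers, pdf, width):
    """Mean, variance, skewness, and (non-excess) kurtosis from a binned PDF."""
    mean = 0.0
    for x, p in zip(centers, pdf):
        mean += x * p * width
    var = m3 = m4 = 0.0
    for x, p in zip(centers, pdf):
        d = x - mean
        var += d ** 2 * p * width
        m3 += d ** 3 * p * width
        m4 += d ** 4 * p * width
    skew = m3 / var ** 1.5
    kurt = m4 / var ** 2  # subtract 3 for excess kurtosis
    return mean, var, skew, kurt


# Small symmetric stand-in sample (NOT HIPPO data).
demo_values = [1, 2, 2, 3, 3, 3, 4, 4, 5]
centers, counts, width = histogram(demo_values, 5)
pdf = normalize(counts, width)
mean, var, skew, kurt = binned_moments(centers, pdf, width)
```

Because every sample is replaced by its bin center, the binned variance and higher moments differ slightly from values computed on the raw data, which is in line with the small variance difference in the table.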

Part 2: Scatter Plot, 2D Histogram/PDF, covariance, and marginal distributions

 * 1) Choose 2 (or more) variables that are interesting to you. One can be the variable used in part 1. Create a scatter plot of the 2 variables against each other.
 * Try scatterhist, or marginhist.m for octave users (you may have to google and download it).
 * Some additional tools are at http://mpo581-hw2.wikispaces.com/Multivariate+display+tools
 * 1) Make a 2D histogram of the same data in the scatter plot of Part 2, Question 1.
 * Try hist2d
 * Play with the size and number of bins. What is best, and why?
 * 1) [[image:ks_2D_v2.jpg width="480" height="360"]][[image:ks_2D_v3.jpg width="480" height="360"]]
 * Normalize your 2D histogram so that it is a true joint PDF (as in Part 1 Question 2).
 * Label the colorbar or contours with appropriate units

cov(co,co2)=166.06
 * 1) Compute covariance from the 2D PDF by looping over the PDF bins, as in Eq. 3.27 and Fig. 3.16, p53-54 in the book.
 * [[image:mpo581-assignment1/ks_2dpdf.jpg width="480" height="360"]]
 * Compare this to the covariance you get from the built-in function, cov, acting on the raw data.
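
The normalization and the bin-loop covariance can be sketched as follows, again in Python rather than Matlab. This is an illustrative sketch only: the function names and the `demo_x`/`demo_y` series are hypothetical, the data are synthetic, and whether the loop matches the book's Eq. 3.27 term for term has not been checked. The joint PDF is normalized so that summing `pdf * dx * dy` over all bins gives 1.

```python
def joint_pdf(x, y, nx, ny):
    """2D histogram of (x, y) normalized to a joint PDF."""
    x_lo, x_hi = min(x), max(x)
    y_lo, y_hi = min(y), max(y)
    dx = (x_hi - x_lo) / nx
    dy = (y_hi - y_lo) / ny
    counts = [[0] * ny for _ in range(nx)]
    for xv, yv in zip(x, y):
        i = min(int((xv - x_lo) / dx), nx - 1)
        j = min(int((yv - y_lo) / dy), ny - 1)
        counts[i][j] += 1
    n = len(x)
    pdf = [[c / (n * dx * dy) for c in row] for row in counts]
    xc = [x_lo + (i + 0.5) * dx for i in range(nx)]
    yc = [y_lo + (j + 0.5) * dy for j in range(ny)]
    return pdf, xc, yc, dx, dy


def binned_covariance(pdf, xc, yc, dx, dy):
    """Covariance accumulated by looping over the joint PDF bins."""
    mx = my = 0.0
    for i, xi in enumerate(xc):
        for j, yj in enumerate(yc):
            w = pdf[i][j] * dx * dy  # probability in bin (i, j)
            mx += xi * w
            my += yj * w
    cov = 0.0
    for i, xi in enumerate(xc):
        for j, yj in enumerate(yc):
            cov += (xi - mx) * (yj - my) * pdf[i][j] * dx * dy
    return cov


def sample_covariance(x, y):
    """Raw-data covariance with N-1 normalization."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


# Synthetic correlated stand-in series (NOT the HIPPO CO/CO2 data).
demo_x = [float(i) for i in range(200)]
demo_y = [2.0 * i + (i % 7) for i in range(200)]
pdf2d, xc, yc, dx, dy = joint_pdf(demo_x, demo_y, 20, 20)
total_probability = sum(p * dx * dy for row in pdf2d for p in row)
cov_binned = binned_covariance(pdf2d, xc, yc, dx, dy)
cov_raw = sample_covariance(demo_x, demo_y)
```

The binned and raw covariances should agree closely but not exactly, since binning replaces each sample with its bin center and the PDF sum normalizes by N rather than N-1.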

Part 3: Study a conditional sample
I chose to bound the data by limiting it to samples with CO2 levels greater than 395. This should be indicative of air that has been more recently impacted by pollution.
 * 1) Use find in Matlab or where in IDL to isolate one or more parts (subsets, or //conditional samples//) of one of the above distributions. Depict the boundaries of your conditional sample on the distribution plot that defines it. Perhaps you'll find a bimodal or multimodal distribution, so that when you slice it you are sampling truly distinct modes or 'regimes'. If not, it is still valid to explore the properties of arbitrarily defined quantiles of a marginal (1D) distribution or sectors of a joint (2D or higher) distribution. Actually, why not make averages of other variables in EVERY bin of a histogram, to make an Importance-weighted distribution (where those other variables define the Importance)?
 * 1) For one or more of your conditional samples (subsets of your data set), display some other variables that weren't part of defining the sample. For example, make histograms or averages (composites) over one or more conditional samples and contrast these with histograms or averages of other conditional samples, or of the full (unconditional) dataset.
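
A conditional sample of this kind, and the per-bin averages suggested in the prompt, can be sketched in Python as below. The list comprehension plays the role of Matlab's find; the function names and the short `co2`/`co` lists are hypothetical illustration values, not measurements from the flights.

```python
def conditional_indices(values, threshold):
    """Indices where values exceed threshold (like Matlab's find)."""
    return [k for k, v in enumerate(values) if v > threshold]


def composite_mean(other, indices):
    """Average of another variable over the conditional sample."""
    return sum(other[k] for k in indices) / len(indices)


def bin_averages(defining, other, nbins):
    """Mean of `other` in each equal-width bin of `defining`.

    Empty bins are returned as None.
    """
    lo, hi = min(defining), max(defining)
    width = (hi - lo) / nbins
    sums = [0.0] * nbins
    counts = [0] * nbins
    for d, o in zip(defining, other):
        k = min(int((d - lo) / width), nbins - 1)
        sums[k] += o
        counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Made-up values for illustration only (NOT HIPPO data).
co2 = [386.0, 390.0, 396.0, 398.0, 388.0, 397.0]
co = [100.0, 110.0, 150.0, 160.0, 105.0, 170.0]

high = conditional_indices(co2, 395.0)       # the CO2 > 395 subset
co_high = composite_mean(co, high)           # composite over the subset
co_all = sum(co) / len(co)                   # unconditional average
co_by_co2_bin = bin_averages(co2, co, 2)     # average CO in each CO2 bin
```

Contrasting `co_high` with `co_all` is the composite comparison described above; the same indices can be reused for latitude, altitude, or any other variable recorded at the same times.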

Part 4: Scientific Interpretation
The histogram of CO2 shows that the most probable mixing ratios fall within a narrow range. This leads one to believe that CO2 is well mixed in the atmosphere (i.e. there is little hemispheric difference). Additionally, the points that are farthest from the mean lie above it. This is exactly what is expected from a gas that is as well characterized as CO2. Since these samples were taken in March/April, the hemispheric difference will also be limited, as both hemispheres are in a transitional state.
 * Give a scientific interpretation of at least 2 of the above results!

As seen in Part 2, CO2 and CO co-vary strongly. However, the variations are not strictly linear. The different branches seen in the scatterplot show different regimes of mixing or aging of air. CO reacts much more quickly in the atmosphere than CO2, meaning that the destruction of CO occurs on time scales that are shorter than inter-hemispheric mixing. These variations are widely studied and, with the addition of other trace gases, can be used to provide insight into the mixing time scales of the atmosphere and to develop age-of-air models.

The high CO2 regimes should be indicative of air masses that have recently been in contact with the world's population centers. This leads to certain assumptions that can be evaluated from the other PDFs. The first is that high CO2 should be located in the northern hemisphere at lower altitudes. The latitude assumption certainly holds: all of the samples with CO2 concentrations greater than 395 are located between 30 and 80 degrees North, with the likeliest latitude being 75 N. The altitudes of these samples extend to approximately 6 km. Although the most likely altitudes are below 1.5 km, a significant number of the high CO2 samples exist at higher altitudes. This means that there is significant vertical mixing bringing air masses with CO2 enhancements away from the surface and out of the boundary layer.

The other thing to look at is how the high CO2 samples compare with other trace gases. These samples were also on the high end of the spectrum for CO (not surprising given the covariance results). The same is true for methane. The N2O values are on par with the averages (concentrations of N2O in the free troposphere vary by only about 5 ppbv). The fact that the N2O values are not greatly enhanced would indicate that these air parcels with CO2 enhancements are not originating in the tropics. (The largest sources of N2O are production by lightning in the atmosphere and by soils, especially in the deep tropics.)

Ozone was the other tracer considered. The ozone values are on the low end of the range, meaning that these samples have no stratospheric influence (not at all surprising since CO2 originates at the surface). For surface levels, the ozone values do indicate the influence of pollution, since 80 ppbv is the threshold limit for 8 hr ozone defined by the EPA.
All of this information combined gives some ideas about where the high levels of CO2 are originating from: most likely population centers of North America and Asia and possibly biomass burning activities in Canada and Alaska.