Marcela_Assigment_1

Dataset used: daily buoy data from the RAMA array in the Indian Ocean, October 2011-January 2012.

http://www.pmel.noaa.gov/tao/disdel/frames/main.html



Part 0.1: PDF units?
After class I looked for more references on PDFs. I found Wilks' definition in Chapter IV of his book:



If we take Wilks' definition, f(x) is a PDF only if the integral of f(x) over its domain equals 1. Since f(x)dx must be dimensionless, the units of f(x) are 1/units(x). The probability that x takes a value in [x1 x2] is then found by integrating f(x)dx from x1 to x2.

When we plot a histogram we don't have a continuous function f(x); instead we have discrete bins. The integral becomes a sum, over the bins between [x1 x2], of the PDF value times the bin width. That sum over the data's whole domain should be 1.
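The normalization described above can be sketched in a few lines. This is only an illustration in plain Python (standard library, not the MATLAB used for the assignment), and the data are synthetic stand-ins, not the RAMA record. The function name `histogram_pdf` and the bin range are assumptions made for the example.

```python
import random

def histogram_pdf(data, lo, hi, dx):
    """Bin data on [lo, hi) and return (centers, counts, pdf).

    pdf has units of 1/units(data), so pdf * dx is dimensionless.
    """
    nbins = int(round((hi - lo) / dx))
    counts = [0] * nbins
    for x in data:
        if lo <= x < hi:
            k = min(int((x - lo) / dx), nbins - 1)
            counts[k] += 1
    n = sum(counts)                       # samples that fell inside [lo, hi)
    pdf = [c / (n * dx) for c in counts]  # normalize by N*dx
    centers = [lo + (k + 0.5) * dx for k in range(nbins)]
    return centers, counts, pdf

# Synthetic stand-in for 123 daily SST values (deg C); NOT the RAMA data.
random.seed(0)
sst = [random.gauss(29.0, 0.4) for _ in range(123)]

dx = 0.1
centers, counts, pdf = histogram_pdf(sst, 27.0, 31.0, dx)

# Discrete version of the integral over the whole domain: sum of f(x)*dx.
total_area = sum(p * dx for p in pdf)

# Discrete version of P(28.5 <= x < 29.5): sum f(x)*dx over those bins only.
p_interval = sum(p * dx for c, p in zip(centers, pdf) if 28.5 <= c < 29.5)

print(f"area = {total_area:.6f}, P(28.5 <= SST < 29.5) = {p_interval:.3f}")
```

The key point is that the bin width enters twice: once in the normalization (dividing by N*dx) and once when summing the area (multiplying by dx).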

I also found this to be consistent with Menke's book, which gives an equation for the median. They find it by integrating under the curve until the area reaches 0.5:
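For a binned PDF, the same idea (accumulate area until it reaches 0.5) might look like the sketch below. It is written in plain Python for illustration; the function name `median_from_pdf`, the linear interpolation inside the bin, and the small example PDF are all assumptions, not taken from Menke's book.

```python
def median_from_pdf(edges, pdf):
    """Median of a binned PDF: accumulate area until it reaches 0.5.

    edges has one more entry than pdf; pdf[k] applies on [edges[k], edges[k+1]).
    """
    area = 0.0
    for k, p in enumerate(pdf):
        dx = edges[k + 1] - edges[k]
        if area + p * dx >= 0.5:
            # Inside this bin the area grows at rate p per unit x,
            # so the remaining (0.5 - area) is reached after (0.5 - area) / p.
            return edges[k] + (0.5 - area) / p
        area += p * dx
    return edges[-1]

# Hypothetical 4-bin PDF (units 1/degC) with bin areas 0.1, 0.3, 0.4, 0.2.
edges = [28.0, 28.5, 29.0, 29.5, 30.0]
pdf = [0.2, 0.6, 0.8, 0.4]

median = median_from_pdf(edges, pdf)
print(f"median = {median:.3f} degC")
```

In this example the cumulative area is 0.4 at the start of the third bin, so the median falls inside that bin.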



Part 0.2: Should the PDF look the same for different bin widths?
ANSWER: Yes. I tried the SST data with 3 different bin widths, dT=[0.05 0.1 0.2]. The PDFs obtained are not exactly the same, but they have the same order of magnitude. This makes sense, since all 3 are normalized approximations of the same continuous PDF.

Here are the results. Each plot represents a different buoy, and the position of each plot mirrors the spatial location (lat, lon) of the buoy in the Indian Ocean. I plotted the PDF for the 3 bin widths for each of the 6 buoys. The area under the curve equals 1 in all cases. As the bins get finer, more detail of the PDF becomes visible. However, the three PDFs follow the same shape and have the same order of magnitude, because each curve is normalized by its own total area.

As we decrease dx, the frequency values of the histogram decrease, because the same number of observations N is now spread over more classes. We normalize the values by the area, which means dividing the counts by N*dx. Since dx is smaller (more classes), the division brings the PDF values back up, so the magnitude of the PDF stays comparable.
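A small numerical check of this argument is sketched below, again in plain Python with synthetic data standing in for the SST record. The helper `counts_and_pdf` and the range 27-31 C are assumptions for the example. It compares the peak count and the peak PDF value for the three bin widths.

```python
import random

def counts_and_pdf(data, lo, hi, dx):
    """Return histogram counts and the PDF (counts / (N*dx)) on [lo, hi)."""
    nbins = int(round((hi - lo) / dx))
    counts = [0] * nbins
    for x in data:
        if lo <= x < hi:
            counts[min(int((x - lo) / dx), nbins - 1)] += 1
    n = sum(counts)
    return counts, [c / (n * dx) for c in counts]

# Synthetic stand-in for 123 daily SST values (deg C); NOT the RAMA data.
random.seed(1)
sst = [random.gauss(29.0, 0.4) for _ in range(123)]

peak_count, peak_pdf, area = {}, {}, {}
for dx in (0.05, 0.1, 0.2):
    counts, pdf = counts_and_pdf(sst, 27.0, 31.0, dx)
    peak_count[dx] = max(counts)              # shrinks as dx shrinks
    peak_pdf[dx] = max(pdf)                   # stays the same order of magnitude
    area[dx] = sum(p * dx for p in pdf)       # always 1
    print(f"dT={dx}: peak count={peak_count[dx]}, "
          f"peak pdf={peak_pdf[dx]:.2f}, area={area[dx]:.3f}")
```

The counts drop as the bins get narrower, while the division by N*dx keeps the PDF peaks comparable and the area fixed at 1.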


**Figure 1.**

Part 1: Create 1D Histogram and the Associated PDF
**HISTOGRAMS:**

**- dT=0.05 -**
**Figure 2.1**

**- dT=0.1 -**
**Figure 2.2**

**- dT=0.2 -**
**Figure 2.3**

**Associated PDF:**

**- dT=0.05 -**
**Figure 3.1**

**- dT=0.1 -**
**Figure 3.2**

**- dT=0.2 -**
**Figure 3.3**



**Moments Calculations:**

**- dT=0.05 -**
**Figure 4.1**

**- dT=0.1 -**
**Figure 4.2**

**- dT=0.2 -**
**Figure 4.3**



As dT gets smaller, the values calculated from the PDF get closer to those computed by MATLAB:
**Table 1.**

The calculated mode and the MATLAB mode are very different. On its website, MATLAB recommends obtaining the mode of continuous data by taking the maximum of the histogram or of the associated PDF, instead of calculating it directly from the dataset, because the MATLAB function is set up to work for discrete or coarsely rounded data: []
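The moment calculations from the PDF can be sketched as follows. This is a plain-Python illustration on synthetic data, not the MATLAB code used for the figures, and the variable names are hypothetical. The mean and variance are sums over the bins, and the mode is taken as the center of the bin where the PDF is largest.

```python
import math
import random
import statistics

# Synthetic stand-in for 123 daily SST values (deg C); NOT the RAMA data.
random.seed(2)
sst = [random.gauss(29.0, 0.4) for _ in range(123)]

dx = 0.05
lo = math.floor(min(sst))          # bin range chosen to contain every sample
hi = math.floor(max(sst)) + 1
nbins = int(round((hi - lo) / dx))

counts = [0] * nbins
for x in sst:
    counts[min(int((x - lo) / dx), nbins - 1)] += 1
pdf = [c / (len(sst) * dx) for c in counts]
centers = [lo + (k + 0.5) * dx for k in range(nbins)]

# Moments as discrete integrals over the PDF.
mean_pdf = sum(c * p * dx for c, p in zip(centers, pdf))
var_pdf = sum((c - mean_pdf) ** 2 * p * dx for c, p in zip(centers, pdf))
mode_pdf = centers[pdf.index(max(pdf))]   # peak of the PDF

# Same quantities straight from the samples (population variance, divide by N).
mean_direct = statistics.mean(sst)
var_direct = statistics.pvariance(sst)

# With continuous data nearly every value is unique, so a "most frequent
# value" taken directly from the samples carries little information.
n_unique = len(set(sst))

print(f"mean: pdf={mean_pdf:.3f} direct={mean_direct:.3f}")
print(f"var:  pdf={var_pdf:.4f} direct={var_direct:.4f}")
print(f"mode from pdf peak={mode_pdf:.3f}, unique values={n_unique}/{len(sst)}")
```

Because each sample is replaced by its bin center, the PDF-based mean can differ from the direct mean by at most dx/2, which is consistent with the agreement improving as dT gets smaller.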

Part 2: Scatter Plot, 2D Histogram/PDF covariance and marginal distributions
Variables: SST and air temperature (3m above the buoy).

**Figure 2.1: Scatter Plot**

Notes: One buoy doesn't have air temperature data. The color of each circle depends on the day of the observation, following this colormap:

**Figure 5**

**Figure 6**








**Figure 7:** SST and Rain Rate 2D Histogram

**Figure 8:** 2D Histogram Contours

**Figure 9:** PDF

**Figure 10:** PDF Contours



**Compute the Covariance from the PDF (Eq. 3.27)**

COVARIANCE Calculated: 0.086 COVARIANCE MATLAB: 0.091
**BUOY: 1.5n90e**

COVARIANCE Calculated: 0.090 COVARIANCE MATLAB: 0.092
**BUOY: 0n80.5e**

COVARIANCE Calculated: 0.058 COVARIANCE MATLAB: 0.057
**BUOY: 0n90e**

COVARIANCE Calculated: 0.058 COVARIANCE MATLAB: 0.057
**BUOY: 1.5s80.5e**

COVARIANCE Calculated: 0.138 COVARIANCE MATLAB: 0.141
**BUOY: 1.5s90e**
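As a rough illustration of how a covariance can be computed from a 2D PDF, the sketch below builds a 2D histogram, normalizes it by N*dx*dy, derives the marginal PDFs, and evaluates the double sum of (x - mean_x)(y - mean_y) f(x,y) dx dy. I am assuming this is the form of the equation cited above. The code is plain Python on synthetic, correlated data (not the buoy records), and all names and bin settings are hypothetical.

```python
import math
import random

# Synthetic stand-ins: SST and a correlated air temperature (deg C).
random.seed(3)
n = 123
sst = [random.gauss(29.0, 0.4) for _ in range(n)]
airt = [s - 0.5 + random.gauss(0.0, 0.2) for s in sst]

dx = dy = 0.1
x0, y0 = 26.0, 25.0      # lower edges of the 2D grid (assumed ranges)
nx = ny = 60             # covers 26-32 C and 25-31 C

# 2D histogram.
counts = [[0] * ny for _ in range(nx)]
for x, y in zip(sst, airt):
    i = math.floor((x - x0) / dx)
    j = math.floor((y - y0) / dy)
    if 0 <= i < nx and 0 <= j < ny:
        counts[i][j] += 1
n_in = sum(map(sum, counts))

# Joint PDF (units 1/degC^2) and bin centers.
pdf = [[c / (n_in * dx * dy) for c in row] for row in counts]
xc = [x0 + (i + 0.5) * dx for i in range(nx)]
yc = [y0 + (j + 0.5) * dy for j in range(ny)]

# Marginal PDFs: integrate the joint PDF over the other variable.
fx = [sum(pdf[i][j] * dy for j in range(ny)) for i in range(nx)]
fy = [sum(pdf[i][j] * dx for i in range(nx)) for j in range(ny)]

mx = sum(xc[i] * fx[i] * dx for i in range(nx))
my = sum(yc[j] * fy[j] * dy for j in range(ny))

# Covariance as a double sum over the joint PDF.
cov_pdf = sum((xc[i] - mx) * (yc[j] - my) * pdf[i][j] * dx * dy
              for i in range(nx) for j in range(ny))

# Direct sample covariance, normalized by N-1 (MATLAB's cov default).
sx, sy = sum(sst) / n, sum(airt) / n
cov_direct = sum((x - sx) * (y - sy) for x, y in zip(sst, airt)) / (n - 1)

print(f"covariance from PDF: {cov_pdf:.3f}, direct: {cov_direct:.3f}")
```

Small differences between the two numbers are expected, both from replacing each sample by its bin center and from the N versus N-1 normalization.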

Part 3: Study a Conditional Sample
The data subset I chose is based on the scatter plot of SST and air temperature: SST must be within [28.5 29.5] C and air temperature within [28 29] C. This area is shown in the plots below. It also coincides with the interval that contains the maximum frequencies in the 2D histogram for the majority of the stations.
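Selecting the conditional sample amounts to keeping the days where both variables fall inside their intervals. A minimal sketch in plain Python follows. The `Day` record, the function `in_subset`, and the five sample days are made up for illustration and are not taken from the buoy files.

```python
from typing import NamedTuple

class Day(NamedTuple):
    sst: float    # sea surface temperature, deg C
    airt: float   # air temperature, deg C
    wind: float   # wind speed, m/s

def in_subset(d, sst_range=(28.5, 29.5), airt_range=(28.0, 29.0)):
    """True when both SST and air temperature lie inside their intervals."""
    return (sst_range[0] <= d.sst <= sst_range[1]
            and airt_range[0] <= d.airt <= airt_range[1])

# Hypothetical daily records (NOT real RAMA values).
days = [
    Day(28.2, 27.9, 5.1),   # SST too cold
    Day(28.9, 28.4, 3.2),   # inside both intervals
    Day(29.1, 28.7, 2.8),   # inside both intervals
    Day(29.8, 29.3, 4.0),   # both too warm
    Day(29.4, 27.5, 4.6),   # air temperature too cold
]

subset = [d for d in days if in_subset(d)]
print(f"{len(subset)} of {len(days)} days selected")
```

The other variables (wind speed, dynamic height, salinity, rain) are then examined only on the selected days.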


**Figure 11:** Scatter plots and selected subset


**Figure 12:** Scatter plots and frequency contours

Extra Variables: Wind Speed (m/s) Anomalies: wind speed 4 meters above the sea surface. Dynamic Height (cm) Anomalies: sea surface elevation calculated based on the specific volume anomalies between the surface and 500m. Sea Surface Salinity (SSS) (PSU) Anomalies: sea surface salinity derived from conductivity ([]). Daily accumulated rain (cm) at 3.5m above the sea surface.
**Subset Histograms**


**Figure 13.1: Buoy 1.5N 90E**
**Figure 13.2: Buoy 0N 80.5E**
**Figure 13.3: Buoy 0N 90E**
**Figure 13.4: Buoy 1.5S 80.5E**
**Figure 13.5: Buoy 1.5S 90E**

Part 4: Scientific Interpretation
During the 4-month period, the different buoys present different distributions. Buoys 1.5N90E, 0N90E, and 1.5S80.5E have the same number of samples (123) and present their highest frequencies in the middle of the distribution. Buoy 1.5S90E also has 123 days sampled, but its distribution has 2 frequency maxima at colder SSTs. For buoys 1.5N90E, 0N90E, and 1.5S80.5E, the higher air temperatures and SSTs occur from mid-December through January. SSTs greater than 29.5 C occur less frequently in the earlier months, when the air temperatures are not as high (see Figure 8).

**Conditional Sample**

The subset chosen for the conditional sample consists of the days that present the higher values of SST and air temperature (3m above the sea surface). This subset is the middle of the distribution and includes the highest frequencies of the 2D distribution of SST and air temperature. New variables were added to the analysis: wind speed anomalies, dynamic height anomalies, sea surface salinity anomalies, and rain anomalies. All the anomalies are calculated relative to the period mean. These variables were not available at all the buoys.
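The anomaly calculation can be sketched as below: the mean is taken over the whole period, and the anomalies are then inspected only on the days in the conditional sample. This is a plain-Python illustration with made-up wind speeds and a made-up subset mask, not the buoy data.

```python
# Hypothetical daily wind speeds (m/s) for the whole period.
wind = [5.1, 3.2, 2.8, 4.0, 4.6]

# Hypothetical mask: True on days that belong to the conditional sample.
in_sample = [False, True, True, False, False]

# The reference is the mean over the full period, not over the subset.
period_mean = sum(wind) / len(wind)

# Anomalies for every day, and for the conditional sample only.
anomalies = [w - period_mean for w in wind]
subset_anomalies = [a for a, keep in zip(anomalies, in_sample) if keep]

print(f"period mean = {period_mean:.2f} m/s")
print("subset anomalies:", [round(a, 2) for a in subset_anomalies])
```

Negative anomalies in the subset mean that the variable is below its period average on the selected days, which is how the histograms in the following paragraphs are read.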

Buoy 1.5S90E presents a fairly symmetric SST histogram with a peak around 29.1 C. The air temperature histogram has a well-defined maximum as well, but the distribution is asymmetric. In this case the wind speed values are below the period average (3.85 m/s), and the dynamic height values are also below the period average (124.89 cm). This suggests that during these events both the wind speed and the sea surface height tend to decrease in the area. The SSS histogram has 2 maxima, one close to -0.4 and another close to 0 anomaly with respect to the period mean. This means that the salinity near the buoy stays close to the mean for about 15 days and is ~0.4 PSU below it for about another 15 days. This buoy does not have rain information, so we can't corroborate whether the decrease in salinity is related to precipitation.

Buoy 0N80.5E presents a very clear maximum in the SST histogram. The air temperature values seem to be well distributed over the selected classes, with a maximum close to 28.6 C. This buoy also shows the decrease in wind speed and in dynamic height seen at buoy 1.5S90E. The salinity values are generally below the mean. There was no significant rain in the buoy region during the period. It is important to note that this buoy has only 55 days of data.

Buoy 0N90E does not show a defined maximum in the SST histogram, unlike the previous buoys. Like the other buoys, it presents a decrease in wind speed and dynamic height. The SSS histogram shows a tendency to decrease, although the maximum frequencies have positive values. The rain histogram shows very little to no rain during the period.

Buoys 1.5S80.5E and 1.5S90E present the same characteristics as the other 3 buoys: a decrease in wind speed and dynamic height, and no precipitation. For 1.5S80.5E the salinity values stay close to the mean for the majority of the days, with SSS increases of up to 0.6 PSU in some cases.

The buoys show a decrease in wind speed and dynamic height when the sea surface and air temperatures are high. This is consistent with the idea that lower wind speed would lead to less convergence in the region (we would have to look closely at the wind direction), which is related to a decrease in the sea surface dynamic height and to the zero precipitation values at the stations that have rain data. The SSS changes differ between stations; this could be due to a change in the currents close to the buoys.