Validate Python 'Heat Units' module¶

This notebook will use functions from the HeatUnits Python module to compute degree days for a test data set from the Degree Day Challenage. It will then compare the results with the reference answers from the UC IPM website. For more info on the datasets, see the Degree Day Challenge.

Summary of results - No differences were found between the computations and the reference answers for horizontal cutoffs. - A small number (4-10) of days out of 365 had relatively big differences between the UC IPM website and `HeatU()` for the intermediate and vertical cutoffs. This was seen in all methods (single-triangle, single-sine, double-triangle, double-sine) - Next steps: 1) repeat the calculations on the IPM website, 2) look for an alternative way (e.g., graphical) to compute the metric

First we import the Pandas package and the HeatUnits module:

In [1]:
import pandas as pd
import HeatUnits as hu

Next we import the reference temperature dataset:

In [2]:
# Read CSV into pandas dataframe
data = pd.read_csv('espartoa-weather-2020.csv')

# Preview
pd.reset_option('display.max_rows')
data
Out[2]:
station date tmin tmax
0 Esparto.A 2020-01-01 38 55
1 Esparto.A 2020-01-02 36 67
2 Esparto.A 2020-01-03 33 59
3 Esparto.A 2020-01-04 37 59
4 Esparto.A 2020-01-05 38 63
... ... ... ... ...
361 Esparto.A 2020-12-27 36 54
362 Esparto.A 2020-12-28 38 54
363 Esparto.A 2020-12-29 38 54
364 Esparto.A 2020-12-30 38 54
365 Esparto.A 2020-12-31 38 55

366 rows × 4 columns

Next, we add columns for the following day's min and max temperatures. Some of the degree day formula need these.

In [3]:
# Add column for yesterdays min temp and fill the NA values by repeating the prev/next row
data["tmin_tom"] = data["tmin"].shift(-1)
data.tmin_tom.fillna(data.tmin, inplace=True)

# add column for tomorrows max temp
data["tmax_yest"] = data["tmax"].shift(1)
data.tmax_yest.fillna(data.tmax, inplace=True)

# Preview
data
Out[3]:
station date tmin tmax tmin_tom tmax_yest
0 Esparto.A 2020-01-01 38 55 36.0 55.0
1 Esparto.A 2020-01-02 36 67 33.0 55.0
2 Esparto.A 2020-01-03 33 59 37.0 67.0
3 Esparto.A 2020-01-04 37 59 38.0 59.0
4 Esparto.A 2020-01-05 38 63 36.0 59.0
... ... ... ... ... ... ...
361 Esparto.A 2020-12-27 36 54 38.0 61.0
362 Esparto.A 2020-12-28 38 54 38.0 54.0
363 Esparto.A 2020-12-29 38 54 38.0 54.0
364 Esparto.A 2020-12-30 38 54 38.0 54.0
365 Esparto.A 2020-12-31 38 55 38.0 54.0

366 rows × 6 columns

The HeatUnits module has a single function HeatU that computes degree days from the daily minimum and maximum temperature using the methods described in Zalom (1983) and three different cutoff methods.

Let's look at the help page for HeatU:

In [4]:
help(hu.HeatU)
Help on function HeatU in module HeatUnits:

HeatU(lthres, cthres, cm, coff, ci, tmin, tmax, tmin_tomm=0, tmax_yest=0)

The arguments required include:

  • lthres - lower temperature threshold
  • cthres - upper temperature threshold
  • cm - computation method (1 = single sine; 2 = double sine; 3 = single triangle, 4 = double triangle; 5 = Huber's method)
  • coff - cuttoff method (1 = horizontal, 2 = intermediate, 3 = vertical)
  • ci - Computation interval (ASK SHANE WHAT IS THIS ARGUMENT FOR ?????)
  • tmin - minimum daily temp
  • tmax - maximum daily temp
  • tmin_tomm - tomorrow's daily minimum temp
  • tmax_yest - yesterday's daily maximum temp

Note that several of the arguments expect a temperature value. You can use either Farenheit or Celcius, as long as you're consistent.

To make our live a little easier, we define some global constants:

In [5]:
CM_SNGSIN = 1
CM_DBLSIN = 2
CM_SNGTRI = 3
CM_DBLTRI = 4
CM_HUBER = 5

COFF_H = 1
COFF_I = 2
COFF_V = 3                         

To demonstrate how to compute degree days with HeatU(), let's compute single sine method using a horizontal cutoff:

In [6]:
LOWER_THRESH = 50
UPPER_THRESH = 70

data_sisine = data.copy()
data_sisine["SiSineHor"] = data.apply(lambda x: hu.HeatU(LOWER_THRESH, UPPER_THRESH, CM_SNGSIN, COFF_H, 1, 
                                                  x['tmin'], x['tmax']), axis=1)

data_sisine
Out[6]:
station date tmin tmax tmin_tom tmax_yest SiSineHor
0 Esparto.A 2020-01-01 38 55 36.0 55.0 1.188424
1 Esparto.A 2020-01-02 36 67 33.0 55.0 5.706924
2 Esparto.A 2020-01-03 33 59 37.0 67.0 2.335503
3 Esparto.A 2020-01-04 37 59 38.0 59.0 2.559444
4 Esparto.A 2020-01-05 38 63 36.0 59.0 4.232057
... ... ... ... ... ... ... ...
361 Esparto.A 2020-12-27 36 54 38.0 61.0 0.819485
362 Esparto.A 2020-12-28 38 54 38.0 54.0 0.871982
363 Esparto.A 2020-12-29 38 54 38.0 54.0 0.871982
364 Esparto.A 2020-12-30 38 54 38.0 54.0 0.871982
365 Esparto.A 2020-12-31 38 55 38.0 54.0 1.188424

366 rows × 7 columns

Import Reference Answers¶

To compare these calculations against the reference answers, we first import the 'correct' answers available from the Degree Day Challenge website:

In [7]:
# Read the CSV of official answers into pandas dataframe
ans = pd.read_csv('ucipm_low50_high70_all.csv')

# Preview
ans
Out[7]:
date tmin tmax sngsine_horiz sngsine_vert sngsine_intrmd dblsine_horiz dblsine_vert dblsine_intrmd sngtri_horiz sngtri_vert sngtri_intrmd dbltri_horiz dbltri_vert dbltri_intrmd
0 1/1/2020 38 55 1.19 1.19 1.19 1.15 1.15 1.15 0.74 0.74 0.74 0.70 0.70 0.70
1 1/2/2020 36 67 5.71 5.71 5.71 5.56 5.56 5.56 4.66 4.66 4.66 4.46 4.46 4.46
2 1/3/2020 33 59 2.34 2.34 2.34 2.45 2.45 2.45 1.56 1.56 1.56 1.70 1.70 1.70
3 1/4/2020 37 59 2.56 2.56 2.56 2.59 2.59 2.59 1.84 1.84 1.84 1.88 1.88 1.88
4 1/5/2020 38 63 4.23 4.23 4.23 4.14 4.14 4.14 3.38 3.38 3.38 3.25 3.25 3.25
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
361 12/27/2020 36 54 0.82 0.82 0.82 0.85 0.85 0.85 0.44 0.44 0.44 0.47 0.47 0.47
362 12/28/2020 38 54 0.87 0.87 0.87 0.87 0.87 0.87 0.50 0.50 0.50 0.50 0.50 0.50
363 12/29/2020 38 54 0.87 0.87 0.87 0.87 0.87 0.87 0.50 0.50 0.50 0.50 0.50 0.50
364 12/30/2020 38 54 0.87 0.87 0.87 0.87 0.87 0.87 0.50 0.50 0.50 0.50 0.50 0.50
365 12/31/2020 38 55 1.19 1.19 1.19 1.19 1.19 1.19 0.74 0.74 0.74 0.74 0.74 0.74

366 rows × 15 columns

Compare Answers¶

To compare the values against reference answers, we i) round the values to two decimal places (because the reference values are rounded), and then subtract:

In [8]:
sngsine_diff = data_sisine.round({'SiSineHor': 2})['SiSineHor'] - ans['sngsine_horiz']

View the results (they should all be zero):

In [9]:
pd.set_option('display.max_rows', None)
display(sngsine_diff)
0      0.00
1      0.00
2      0.00
3      0.00
4      0.00
5      0.00
6      0.00
7      0.00
8      0.00
9      0.00
10     0.00
11     0.00
12     0.00
13     0.00
14     0.00
15     0.00
16     0.00
17     0.00
18     0.00
19     0.00
20     0.00
21     0.00
22     0.00
23     0.00
24     0.00
25     0.00
26     0.00
27     0.00
28     0.00
29     0.00
30     0.00
31     0.00
32     0.00
33     0.00
34     0.00
35     0.00
36     0.00
37     0.00
38     0.00
39     0.00
40     0.00
41     0.00
42     0.00
43     0.00
44     0.00
45     0.00
46     0.00
47     0.00
48     0.00
49     0.00
50     0.00
51     0.00
52     0.00
53     0.00
54    -0.01
55     0.00
56     0.00
57     0.00
58     0.00
59     0.00
60     0.00
61     0.00
62     0.00
63     0.00
64     0.00
65     0.00
66     0.00
67     0.00
68     0.00
69     0.00
70     0.00
71     0.00
72     0.00
73     0.00
74     0.00
75     0.00
76     0.00
77     0.00
78     0.00
79     0.00
80     0.00
81     0.00
82     0.00
83     0.00
84     0.00
85     0.00
86     0.00
87     0.00
88     0.00
89     0.00
90     0.00
91     0.00
92     0.00
93     0.00
94     0.00
95     0.00
96     0.00
97     0.00
98     0.00
99     0.00
100    0.00
101    0.00
102    0.00
103    0.00
104    0.00
105    0.00
106    0.00
107    0.00
108    0.00
109    0.00
110    0.00
111    0.00
112    0.00
113    0.00
114    0.00
115    0.00
116    0.00
117    0.00
118    0.00
119    0.00
120    0.00
121    0.00
122    0.00
123    0.00
124    0.00
125    0.00
126    0.00
127    0.00
128    0.00
129    0.00
130    0.00
131    0.00
132    0.00
133    0.00
134    0.00
135    0.00
136    0.00
137    0.00
138    0.00
139    0.00
140    0.00
141    0.00
142    0.00
143    0.00
144    0.00
145    0.00
146    0.00
147    0.00
148    0.00
149    0.00
150    0.00
151    0.00
152    0.00
153    0.00
154    0.00
155    0.00
156    0.00
157    0.00
158    0.00
159    0.00
160    0.00
161    0.00
162    0.00
163    0.00
164    0.00
165    0.00
166    0.00
167    0.00
168    0.00
169    0.00
170    0.00
171    0.00
172    0.00
173    0.00
174    0.00
175    0.00
176    0.00
177    0.00
178    0.00
179    0.00
180    0.00
181    0.00
182    0.00
183    0.00
184    0.00
185    0.00
186    0.00
187    0.00
188    0.00
189    0.00
190    0.00
191    0.00
192    0.00
193    0.00
194    0.00
195    0.00
196    0.00
197    0.00
198    0.00
199    0.00
200    0.00
201    0.00
202    0.00
203    0.00
204    0.00
205    0.00
206    0.00
207    0.00
208    0.00
209    0.00
210    0.00
211    0.00
212    0.00
213    0.00
214    0.00
215    0.00
216    0.00
217    0.00
218    0.00
219    0.00
220    0.00
221    0.00
222    0.00
223    0.00
224    0.00
225    0.00
226    0.00
227    0.00
228    0.00
229    0.00
230    0.00
231    0.00
232    0.00
233    0.00
234    0.00
235    0.00
236    0.00
237    0.00
238    0.00
239    0.00
240    0.00
241    0.00
242    0.00
243    0.00
244    0.00
245    0.00
246    0.00
247    0.00
248    0.00
249    0.00
250    0.00
251    0.00
252    0.00
253    0.00
254    0.00
255    0.00
256    0.00
257    0.00
258    0.00
259    0.00
260    0.00
261    0.00
262    0.00
263    0.00
264    0.00
265    0.00
266    0.00
267    0.00
268    0.00
269    0.00
270    0.00
271    0.00
272    0.00
273    0.00
274    0.00
275    0.00
276    0.00
277    0.00
278    0.00
279    0.00
280    0.00
281    0.00
282    0.00
283    0.00
284    0.00
285    0.00
286    0.00
287    0.00
288    0.00
289    0.00
290    0.00
291    0.00
292    0.00
293    0.00
294    0.00
295    0.00
296    0.00
297    0.00
298    0.00
299    0.00
300    0.00
301    0.00
302    0.00
303    0.00
304    0.00
305    0.00
306    0.00
307    0.00
308    0.00
309    0.00
310    0.00
311    0.00
312    0.00
313    0.00
314    0.00
315    0.00
316    0.00
317    0.00
318    0.00
319    0.00
320    0.00
321    0.00
322    0.00
323    0.00
324    0.00
325    0.00
326    0.00
327    0.00
328    0.00
329    0.00
330    0.00
331    0.00
332    0.00
333    0.00
334    0.00
335    0.00
336    0.00
337    0.00
338    0.00
339    0.00
340    0.00
341    0.00
342    0.00
343    0.00
344    0.00
345    0.00
346    0.00
347    0.00
348    0.00
349    0.00
350    0.00
351    0.00
352    0.00
353    0.00
354    0.00
355    0.00
356    0.00
357    0.00
358    0.00
359    0.00
360    0.00
361    0.00
362    0.00
363    0.00
364    0.00
365    0.00
dtype: float64

An easier way to view the differences is through a frequency table:

In [10]:
sngsine_diff.value_counts()
Out[10]:
 0.00    365
-0.01      1
dtype: int64

Compute and Compare All Degree Day Methods¶

Next we compute all the degree day methods, starting with the triangle methods:

In [11]:
data_sngtri = data.copy()
data_sngtri["SiTriHor"] = data_sngtri.apply(lambda x: hu.HeatU(50, 70, CM_SNGTRI, COFF_H, 1, x['tmin'], x['tmax']), axis=1)
data_sngtri["SiTriInt"] = data_sngtri.apply(lambda x: hu.HeatU(50, 70, CM_SNGTRI, COFF_I, 1, x['tmin'], x['tmax']), axis=1)
data_sngtri["SiTriVert"] = data_sngtri.apply(lambda x: hu.HeatU(50, 70, CM_SNGTRI, COFF_V, 1, x['tmin'], x['tmax']), axis=1)
data_sngtri_cols = ['station', 'date', 'tmin', 'tmax', 'tmin_tom', 'tmax_yest', 'SiTriHor', 'SiTriInt', 'SiTriVert']

# data_sngtri = data_sngtri[data_sngtri_cols]      ## works but causes an error below
data_sngtri = data_sngtri.loc[:,data_sngtri_cols]  
data_sngtri.head()
Out[11]:
station date tmin tmax tmin_tom tmax_yest SiTriHor SiTriInt SiTriVert
0 Esparto.A 2020-01-01 38 55 36.0 55.0 0.735294 0.735294 0.735294
1 Esparto.A 2020-01-02 36 67 33.0 55.0 4.661290 4.661290 4.661290
2 Esparto.A 2020-01-03 33 59 37.0 67.0 1.557692 1.557692 1.557692
3 Esparto.A 2020-01-04 37 59 38.0 59.0 1.840909 1.840909 1.840909
4 Esparto.A 2020-01-05 38 63 36.0 59.0 3.380000 3.380000 3.380000

Now compute the differences between the values computed by HeatU() and the answers from the IPM website:

In [12]:
data_sngtri["SiTriHor_diff"] = round(data_sngtri.SiTriHor, 2) - ans['sngtri_horiz']
data_sngtri["SiTriInt_diff"] = round(data_sngtri.SiTriInt, 2) - ans['sngtri_intrmd']
data_sngtri["SiTriVert_diff"] = round(data_sngtri.SiTriVert, 2) - ans['sngtri_vert']
data_sngtri.head()
Out[12]:
station date tmin tmax tmin_tom tmax_yest SiTriHor SiTriInt SiTriVert SiTriHor_diff SiTriInt_diff SiTriVert_diff
0 Esparto.A 2020-01-01 38 55 36.0 55.0 0.735294 0.735294 0.735294 0.0 0.0 0.0
1 Esparto.A 2020-01-02 36 67 33.0 55.0 4.661290 4.661290 4.661290 0.0 0.0 0.0
2 Esparto.A 2020-01-03 33 59 37.0 67.0 1.557692 1.557692 1.557692 0.0 0.0 0.0
3 Esparto.A 2020-01-04 37 59 38.0 59.0 1.840909 1.840909 1.840909 0.0 0.0 0.0
4 Esparto.A 2020-01-05 38 63 36.0 59.0 3.380000 3.380000 3.380000 0.0 0.0 0.0

Compare Differences: Single triangle with horizonal cutoff¶

In [13]:
# Single triangle with horizonal cutoff
data_sngtri.SiTriHor_diff.value_counts()
Out[13]:
 0.00    361
-0.01      1
-0.01      1
-0.01      1
-0.01      1
-0.01      1
Name: SiTriHor_diff, dtype: int64

Compare Differences: Single triangle with intermediate cutoff¶

In [14]:
# Single triangle with intermediate cutoff
data_sngtri.SiTriInt_diff.value_counts()
Out[14]:
 0.00     355
-0.01       4
-0.01       1
-0.01       1
 17.00      1
 16.50      1
 14.00      1
 13.50      1
-0.01       1
Name: SiTriInt_diff, dtype: int64

Oh dear. Let's review the rows with significant differences:

In [15]:
data_sngtri[data_sngtri.SiTriInt_diff >= 10]
Out[15]:
station date tmin tmax tmin_tom tmax_yest SiTriHor SiTriInt SiTriVert SiTriHor_diff SiTriInt_diff SiTriVert_diff
226 Esparto.A 2020-08-14 71 103 68.0 97.0 20.0 20.0 20.0 0.0 17.0 20.0
230 Esparto.A 2020-08-18 71 102 71.0 95.0 20.0 20.0 20.0 0.0 16.5 20.0
231 Esparto.A 2020-08-19 71 97 62.0 102.0 20.0 20.0 20.0 0.0 14.0 20.0
270 Esparto.A 2020-09-27 72 95 69.0 90.0 20.0 20.0 20.0 0.0 13.5 20.0

Compare Differences: Single triangle with vertical cutoff¶

In [16]:
# Single triangle with vertical cutoff
data_sngtri.SiTriVert_diff.value_counts()
Out[16]:
 0.00     358
 20.00      4
-0.01       2
-0.01       1
-0.01       1
Name: SiTriVert_diff, dtype: int64

Investigate Differences in the Single Triangle Methods¶

Oh dear, we see 4 days of the year where we're getting a big difference between the IPM website and the calculations from HeatU() for the single triangle method with the intermediate and vertical cutoffs (horizontal is fine).

Let's look at the rows where the difference is 20:

In [17]:
data_sngtri[data_sngtri.SiTriVert_diff >= 20]
Out[17]:
station date tmin tmax tmin_tom tmax_yest SiTriHor SiTriInt SiTriVert SiTriHor_diff SiTriInt_diff SiTriVert_diff
226 Esparto.A 2020-08-14 71 103 68.0 97.0 20.0 20.0 20.0 0.0 17.0 20.0
230 Esparto.A 2020-08-18 71 102 71.0 95.0 20.0 20.0 20.0 0.0 16.5 20.0
231 Esparto.A 2020-08-19 71 97 62.0 102.0 20.0 20.0 20.0 0.0 14.0 20.0
270 Esparto.A 2020-09-27 72 95 69.0 90.0 20.0 20.0 20.0 0.0 13.5 20.0

To see what's going on, let's join the data_sngtri and ans dataframes. They each have a 'date' column, however these columns are strings in different formats (see below for code to convert both of them to datetime64[ns]). So instead we'll merge them with Dataframe.join().

Join the Answer Table¶

Step 1 is to rename the columns in data_sngtri (to make it clear which dataframe they're from):

In [18]:
data_sngtri.rename(columns={'date': 'date_hu', 'tmin': 'tmin_hu', 'tmax': 'tmax_hu',
                           'SiTriHor': 'SiTriHor_hu', 'SiTriInt': 'SiTriInt_hu', 
                           'SiTriVert': 'SiTriVert_hu'}, inplace=True)
data_sngtri.head()
Out[18]:
station date_hu tmin_hu tmax_hu tmin_tom tmax_yest SiTriHor_hu SiTriInt_hu SiTriVert_hu SiTriHor_diff SiTriInt_diff SiTriVert_diff
0 Esparto.A 2020-01-01 38 55 36.0 55.0 0.735294 0.735294 0.735294 0.0 0.0 0.0
1 Esparto.A 2020-01-02 36 67 33.0 55.0 4.661290 4.661290 4.661290 0.0 0.0 0.0
2 Esparto.A 2020-01-03 33 59 37.0 67.0 1.557692 1.557692 1.557692 0.0 0.0 0.0
3 Esparto.A 2020-01-04 37 59 38.0 59.0 1.840909 1.840909 1.840909 0.0 0.0 0.0
4 Esparto.A 2020-01-05 38 63 36.0 59.0 3.380000 3.380000 3.380000 0.0 0.0 0.0

Next, create a subset of the columns of the Answers dataframe.

In [19]:
## Take a subset of columns from the reference dataset
ans_cols = ['date', 'tmin', 'tmax', 'sngtri_horiz', 'sngtri_intrmd', 'sngtri_vert']
ans_sngtri = ans.loc[:, ans_cols].copy()
ans_sngtri.head()
Out[19]:
date tmin tmax sngtri_horiz sngtri_intrmd sngtri_vert
0 1/1/2020 38 55 0.74 0.74 0.74
1 1/2/2020 36 67 4.66 4.66 4.66
2 1/3/2020 33 59 1.56 1.56 1.56
3 1/4/2020 37 59 1.84 1.84 1.84
4 1/5/2020 38 63 3.38 3.38 3.38

Rename them:

In [20]:
## Rename some of the columns
ans_sngtri.rename(columns={'date': 'date_ipm', 'tmin': 'tmin_ipm', 'tmax': 'tmax_ipm',
                           'sngtri_horiz': 'SiTriHor_ipm', 'sngtri_intrmd': 'SiTriInt_ipm', 
                           'sngtri_vert': 'SiTriVert_ipm'}, inplace=True)
ans_sngtri.head()
Out[20]:
date_ipm tmin_ipm tmax_ipm SiTriHor_ipm SiTriInt_ipm SiTriVert_ipm
0 1/1/2020 38 55 0.74 0.74 0.74
1 1/2/2020 36 67 4.66 4.66 4.66
2 1/3/2020 33 59 1.56 1.56 1.56
3 1/4/2020 37 59 1.84 1.84 1.84
4 1/5/2020 38 63 3.38 3.38 3.38

Before we join them, let's verify they have the same number of rows:

In [21]:
data_sngtri.shape
ans_sngtri.shape
Out[21]:
(366, 6)

Now we can join them:

In [22]:
sngtri_comb = data_sngtri.join(ans_sngtri, lsuffix='_hu', rsuffix='_ipm')
sngtri_comb.head()
Out[22]:
station date_hu tmin_hu tmax_hu tmin_tom tmax_yest SiTriHor_hu SiTriInt_hu SiTriVert_hu SiTriHor_diff SiTriInt_diff SiTriVert_diff date_ipm tmin_ipm tmax_ipm SiTriHor_ipm SiTriInt_ipm SiTriVert_ipm
0 Esparto.A 2020-01-01 38 55 36.0 55.0 0.735294 0.735294 0.735294 0.0 0.0 0.0 1/1/2020 38 55 0.74 0.74 0.74
1 Esparto.A 2020-01-02 36 67 33.0 55.0 4.661290 4.661290 4.661290 0.0 0.0 0.0 1/2/2020 36 67 4.66 4.66 4.66
2 Esparto.A 2020-01-03 33 59 37.0 67.0 1.557692 1.557692 1.557692 0.0 0.0 0.0 1/3/2020 33 59 1.56 1.56 1.56
3 Esparto.A 2020-01-04 37 59 38.0 59.0 1.840909 1.840909 1.840909 0.0 0.0 0.0 1/4/2020 37 59 1.84 1.84 1.84
4 Esparto.A 2020-01-05 38 63 36.0 59.0 3.380000 3.380000 3.380000 0.0 0.0 0.0 1/5/2020 38 63 3.38 3.38 3.38
In [23]:
sngtri_comb = sngtri_comb.reindex(columns=['station', 'date_hu', 'date_ipm', 'tmin_hu', 'tmin_ipm', 'tmax_hu',
                                           'tmax_ipm', 'tmin_tom', 'tmax_yest', 
                                           'SiTriHor_hu', 'SiTriHor_ipm', 'SiTriHor_diff',
                                           'SiTriInt_hu', 'SiTriInt_ipm', 'SiTriInt_diff', 
                                           'SiTriVert_hu', 'SiTriVert_ipm', 'SiTriVert_diff'])
sngtri_comb.head()
Out[23]:
station date_hu date_ipm tmin_hu tmin_ipm tmax_hu tmax_ipm tmin_tom tmax_yest SiTriHor_hu SiTriHor_ipm SiTriHor_diff SiTriInt_hu SiTriInt_ipm SiTriInt_diff SiTriVert_hu SiTriVert_ipm SiTriVert_diff
0 Esparto.A 2020-01-01 1/1/2020 38 38 55 55 36.0 55.0 0.735294 0.74 0.0 0.735294 0.74 0.0 0.735294 0.74 0.0
1 Esparto.A 2020-01-02 1/2/2020 36 36 67 67 33.0 55.0 4.661290 4.66 0.0 4.661290 4.66 0.0 4.661290 4.66 0.0
2 Esparto.A 2020-01-03 1/3/2020 33 33 59 59 37.0 67.0 1.557692 1.56 0.0 1.557692 1.56 0.0 1.557692 1.56 0.0
3 Esparto.A 2020-01-04 1/4/2020 37 37 59 59 38.0 59.0 1.840909 1.84 0.0 1.840909 1.84 0.0 1.840909 1.84 0.0
4 Esparto.A 2020-01-05 1/5/2020 38 38 63 63 36.0 59.0 3.380000 3.38 0.0 3.380000 3.38 0.0 3.380000 3.38 0.0

View the rows with the big differences:

In [24]:
sngtri_comb[sngtri_comb.SiTriVert_diff >= 20]
Out[24]:
station date_hu date_ipm tmin_hu tmin_ipm tmax_hu tmax_ipm tmin_tom tmax_yest SiTriHor_hu SiTriHor_ipm SiTriHor_diff SiTriInt_hu SiTriInt_ipm SiTriInt_diff SiTriVert_hu SiTriVert_ipm SiTriVert_diff
226 Esparto.A 2020-08-14 8/14/2020 71 71 103 103 68.0 97.0 20.0 20.0 0.0 20.0 3.0 17.0 20.0 0.0 20.0
230 Esparto.A 2020-08-18 8/18/2020 71 71 102 102 71.0 95.0 20.0 20.0 0.0 20.0 3.5 16.5 20.0 0.0 20.0
231 Esparto.A 2020-08-19 8/19/2020 71 71 97 97 62.0 102.0 20.0 20.0 0.0 20.0 6.0 14.0 20.0 0.0 20.0
270 Esparto.A 2020-09-27 9/27/2020 72 72 95 95 69.0 90.0 20.0 20.0 0.0 20.0 6.5 13.5 20.0 0.0 20.0

Single Sine Methods¶

In [25]:
data_sngsin = data.copy()
data_sngsin["SiSineHor"] = data_sngsin.apply(lambda x: hu.HeatU(50, 70, CM_SNGSIN, COFF_H, 1, x['tmin'], x['tmax']), axis=1)
data_sngsin["SiSineInt"] = data_sngsin.apply(lambda x: hu.HeatU(50, 70, CM_SNGSIN, COFF_I, 1, x['tmin'], x['tmax']), axis=1)
data_sngsin["SiSineVert"] = data_sngsin.apply(lambda x: hu.HeatU(50, 70, CM_SNGSIN, COFF_V, 1, x['tmin'], x['tmax']), axis=1)
data_sngsin.head()
Out[25]:
station date tmin tmax tmin_tom tmax_yest SiSineHor SiSineInt SiSineVert
0 Esparto.A 2020-01-01 38 55 36.0 55.0 1.188424 1.188424 1.188424
1 Esparto.A 2020-01-02 36 67 33.0 55.0 5.706924 5.706924 5.706924
2 Esparto.A 2020-01-03 33 59 37.0 67.0 2.335503 2.335503 2.335503
3 Esparto.A 2020-01-04 37 59 38.0 59.0 2.559444 2.559444 2.559444
4 Esparto.A 2020-01-05 38 63 36.0 59.0 4.232057 4.232057 4.232057

Compare with IPM answers:

In [26]:
ans.head()
Out[26]:
date tmin tmax sngsine_horiz sngsine_vert sngsine_intrmd dblsine_horiz dblsine_vert dblsine_intrmd sngtri_horiz sngtri_vert sngtri_intrmd dbltri_horiz dbltri_vert dbltri_intrmd
0 1/1/2020 38 55 1.19 1.19 1.19 1.15 1.15 1.15 0.74 0.74 0.74 0.70 0.70 0.70
1 1/2/2020 36 67 5.71 5.71 5.71 5.56 5.56 5.56 4.66 4.66 4.66 4.46 4.46 4.46
2 1/3/2020 33 59 2.34 2.34 2.34 2.45 2.45 2.45 1.56 1.56 1.56 1.70 1.70 1.70
3 1/4/2020 37 59 2.56 2.56 2.56 2.59 2.59 2.59 1.84 1.84 1.84 1.88 1.88 1.88
4 1/5/2020 38 63 4.23 4.23 4.23 4.14 4.14 4.14 3.38 3.38 3.38 3.25 3.25 3.25
In [27]:
data_sngsin.head()
Out[27]:
station date tmin tmax tmin_tom tmax_yest SiSineHor SiSineInt SiSineVert
0 Esparto.A 2020-01-01 38 55 36.0 55.0 1.188424 1.188424 1.188424
1 Esparto.A 2020-01-02 36 67 33.0 55.0 5.706924 5.706924 5.706924
2 Esparto.A 2020-01-03 33 59 37.0 67.0 2.335503 2.335503 2.335503
3 Esparto.A 2020-01-04 37 59 38.0 59.0 2.559444 2.559444 2.559444
4 Esparto.A 2020-01-05 38 63 36.0 59.0 4.232057 4.232057 4.232057
In [28]:
data_sngsin["SiSineHor_diff"] = round(data_sngsin.SiSineHor, 2) - ans['sngsine_horiz']
data_sngsin["SiSineInt_diff"] = round(data_sngsin.SiSineInt, 2) - ans['sngsine_intrmd']
data_sngsin["SiSineVert_diff"] = round(data_sngsin.SiSineVert, 2) - ans['sngsine_vert']
data_sngsin.head()
Out[28]:
station date tmin tmax tmin_tom tmax_yest SiSineHor SiSineInt SiSineVert SiSineHor_diff SiSineInt_diff SiSineVert_diff
0 Esparto.A 2020-01-01 38 55 36.0 55.0 1.188424 1.188424 1.188424 0.0 0.0 0.0
1 Esparto.A 2020-01-02 36 67 33.0 55.0 5.706924 5.706924 5.706924 0.0 0.0 0.0
2 Esparto.A 2020-01-03 33 59 37.0 67.0 2.335503 2.335503 2.335503 0.0 0.0 0.0
3 Esparto.A 2020-01-04 37 59 38.0 59.0 2.559444 2.559444 2.559444 0.0 0.0 0.0
4 Esparto.A 2020-01-05 38 63 36.0 59.0 4.232057 4.232057 4.232057 0.0 0.0 0.0

Look at the distribution of the differences:

In [29]:
data_sngsin.SiSineHor_diff.value_counts()
Out[29]:
 0.00    365
-0.01      1
Name: SiSineHor_diff, dtype: int64
In [30]:
data_sngsin.SiSineInt_diff.value_counts()
Out[30]:
0.0     362
17.0      1
16.5      1
14.0      1
13.5      1
Name: SiSineInt_diff, dtype: int64
In [31]:
data_sngsin.SiSineVert_diff.value_counts()
Out[31]:
0.0     362
20.0      4
Name: SiSineVert_diff, dtype: int64

See which rows are causing the issue:

In [32]:
data_sngsin[data_sngsin.SiSineVert_diff > 10]
Out[32]:
station date tmin tmax tmin_tom tmax_yest SiSineHor SiSineInt SiSineVert SiSineHor_diff SiSineInt_diff SiSineVert_diff
226 Esparto.A 2020-08-14 71 103 68.0 97.0 20.0 20.0 20.0 0.0 17.0 20.0
230 Esparto.A 2020-08-18 71 102 71.0 95.0 20.0 20.0 20.0 0.0 16.5 20.0
231 Esparto.A 2020-08-19 71 97 62.0 102.0 20.0 20.0 20.0 0.0 14.0 20.0
270 Esparto.A 2020-09-27 72 95 69.0 90.0 20.0 20.0 20.0 0.0 13.5 20.0

There seem to be 4 rows where the results of `HeatU()` for the Single Sine method with the intermediate and vertical cutoffs that differ from those from the IPM website

Double Sine Methods¶

In [33]:
data_dblsin = data.copy()
data_dblsin["DoSineHor"] = data_dblsin.apply(lambda x: hu.HeatU(50, 70, CM_DBLSIN, COFF_H, 1, x['tmin'], x['tmax'], x['tmin_tom'], x['tmax_yest']), axis=1)
data_dblsin["DoSineInt"] = data_dblsin.apply(lambda x: hu.HeatU(50, 70, CM_DBLSIN, COFF_I, 1, x['tmin'], x['tmax'], x['tmin_tom'], x['tmax_yest']), axis=1)
data_dblsin["DoSineVert"] = data_dblsin.apply(lambda x: hu.HeatU(50, 70, CM_DBLSIN, COFF_V, 1, x['tmin'], x['tmax'], x['tmin_tom'], x['tmax_yest']), axis=1)
data_dblsin.head()
Out[33]:
station date tmin tmax tmin_tom tmax_yest DoSineHor DoSineInt DoSineVert
0 Esparto.A 2020-01-01 38 55 36.0 55.0 1.154219 1.154219 1.154219
1 Esparto.A 2020-01-02 36 67 33.0 55.0 5.559096 5.559096 5.559096
2 Esparto.A 2020-01-03 33 59 37.0 67.0 2.447474 2.447474 2.447474
3 Esparto.A 2020-01-04 37 59 38.0 59.0 2.592931 2.592931 2.592931
4 Esparto.A 2020-01-05 38 63 36.0 59.0 4.141094 4.141094 4.141094

Compute differences with the IPM answers:

In [34]:
data_dblsin["DoSineHor_diff"] = round(data_dblsin.DoSineHor, 2) - ans['dblsine_horiz']
data_dblsin["DoSineInt_diff"] = round(data_dblsin.DoSineInt, 2) - ans['dblsine_intrmd']
data_dblsin["DoSineVert_diff"] = round(data_dblsin.DoSineVert, 2) - ans['dblsine_vert']
data_dblsin.head()
Out[34]:
station date tmin tmax tmin_tom tmax_yest DoSineHor DoSineInt DoSineVert DoSineHor_diff DoSineInt_diff DoSineVert_diff
0 Esparto.A 2020-01-01 38 55 36.0 55.0 1.154219 1.154219 1.154219 0.0 0.0 0.0
1 Esparto.A 2020-01-02 36 67 33.0 55.0 5.559096 5.559096 5.559096 0.0 0.0 0.0
2 Esparto.A 2020-01-03 33 59 37.0 67.0 2.447474 2.447474 2.447474 0.0 0.0 0.0
3 Esparto.A 2020-01-04 37 59 38.0 59.0 2.592931 2.592931 2.592931 0.0 0.0 0.0
4 Esparto.A 2020-01-05 38 63 36.0 59.0 4.141094 4.141094 4.141094 0.0 0.0 0.0
In [35]:
## Double sine horizontal cutoff
data_dblsin.DoSineHor_diff.value_counts()
Out[35]:
0.0    366
Name: DoSineHor_diff, dtype: int64
In [36]:
## Double sine intermediate cutoff
data_dblsin.DoSineInt_diff.value_counts()
Out[36]:
 0.00     356
-0.01       2
-0.01       1
 5.86       1
 7.88       1
 5.06       1
 16.50      1
 5.75       1
 3.32       1
 4.04       1
Name: DoSineInt_diff, dtype: int64
In [37]:
## Double sine vertical cutoff
data_dblsin.DoSineVert_diff.value_counts()
Out[37]:
0.0     359
10.0      6
20.0      1
Name: DoSineVert_diff, dtype: int64

Identify rows where they don't agree:

In [38]:
data_dblsin[data_dblsin.DoSineVert_diff > 1]
Out[38]:
station date tmin tmax tmin_tom tmax_yest DoSineHor DoSineInt DoSineVert DoSineHor_diff DoSineInt_diff DoSineVert_diff
225 Esparto.A 2020-08-13 60 97 71.0 97.0 18.864001 12.344432 12.344432 0.0 5.86 10.0
226 Esparto.A 2020-08-14 71 103 68.0 97.0 19.897955 11.434647 11.434647 0.0 7.88 10.0
229 Esparto.A 2020-08-17 59 95 71.0 96.0 18.665720 12.394310 12.394310 0.0 5.06 10.0
230 Esparto.A 2020-08-18 71 102 71.0 95.0 20.000000 20.000000 20.000000 0.0 16.50 20.0
231 Esparto.A 2020-08-19 71 97 62.0 102.0 19.168286 12.341712 12.341712 0.0 5.75 10.0
269 Esparto.A 2020-09-26 58 90 72.0 85.0 18.373521 12.569215 12.569215 0.0 3.32 10.0
270 Esparto.A 2020-09-27 72 95 69.0 90.0 19.958221 11.214880 11.214880 0.0 4.04 10.0
In [39]:
data_dblsin[data_dblsin.DoSineInt_diff > 1]
Out[39]:
station date tmin tmax tmin_tom tmax_yest DoSineHor DoSineInt DoSineVert DoSineHor_diff DoSineInt_diff DoSineVert_diff
225 Esparto.A 2020-08-13 60 97 71.0 97.0 18.864001 12.344432 12.344432 0.0 5.86 10.0
226 Esparto.A 2020-08-14 71 103 68.0 97.0 19.897955 11.434647 11.434647 0.0 7.88 10.0
229 Esparto.A 2020-08-17 59 95 71.0 96.0 18.665720 12.394310 12.394310 0.0 5.06 10.0
230 Esparto.A 2020-08-18 71 102 71.0 95.0 20.000000 20.000000 20.000000 0.0 16.50 20.0
231 Esparto.A 2020-08-19 71 97 62.0 102.0 19.168286 12.341712 12.341712 0.0 5.75 10.0
269 Esparto.A 2020-09-26 58 90 72.0 85.0 18.373521 12.569215 12.569215 0.0 3.32 10.0
270 Esparto.A 2020-09-27 72 95 69.0 90.0 19.958221 11.214880 11.214880 0.0 4.04 10.0

There seem to be 7 rows where the results of `HeatU()` for the Double Sine method with the intermediate and vertical cutoffs that differ from those from the IPM website

Double-Triangle Methods¶

In [40]:
data_dbltri = data.copy()
data_dbltri["DoTriHor"] = data_dbltri.apply(lambda x: hu.HeatU(50, 70, CM_DBLTRI, COFF_H, 1, x['tmin'], x['tmax'], x['tmin_tom'], x['tmax_yest']), axis=1)
data_dbltri["DoTriInt"] = data_dbltri.apply(lambda x: hu.HeatU(50, 70, CM_DBLTRI, COFF_I, 1, x['tmin'], x['tmax'], x['tmin_tom'], x['tmax_yest']), axis=1)
data_dbltri["DoTriVert"] = data_dbltri.apply(lambda x: hu.HeatU(50, 70, CM_DBLTRI, COFF_V, 1, x['tmin'], x['tmax'], x['tmin_tom'], x['tmax_yest']), axis=1)
data_dbltri.head()
Out[40]:
station date tmin tmax tmin_tom tmax_yest DoTriHor DoTriInt DoTriVert
0 Esparto.A 2020-01-01 38 55 36.0 55.0 0.696594 0.696594 0.696594
1 Esparto.A 2020-01-02 36 67 33.0 55.0 4.455645 4.455645 4.455645
2 Esparto.A 2020-01-03 33 59 37.0 67.0 1.699301 1.699301 1.699301
3 Esparto.A 2020-01-04 37 59 38.0 59.0 1.884740 1.884740 1.884740
4 Esparto.A 2020-01-05 38 63 36.0 59.0 3.254815 3.254815 3.254815

Compute differences with IPM website:

In [41]:
data_dbltri["DoTriHorr_diff"] = round(data_dbltri.DoTriHor, 2) - ans['dbltri_horiz']
data_dbltri["DoTriInt_diff"] = round(data_dbltri.DoTriInt, 2) - ans['dbltri_intrmd']
data_dbltri["DoTriVert_diff"] = round(data_dbltri.DoTriVert, 2) - ans['dbltri_vert']
data_dbltri.head()
Out[41]:
station date tmin tmax tmin_tom tmax_yest DoTriHor DoTriInt DoTriVert DoTriHorr_diff DoTriInt_diff DoTriVert_diff
0 Esparto.A 2020-01-01 38 55 36.0 55.0 0.696594 0.696594 0.696594 0.0 0.0 0.0
1 Esparto.A 2020-01-02 36 67 33.0 55.0 4.455645 4.455645 4.455645 0.0 0.0 0.0
2 Esparto.A 2020-01-03 33 59 37.0 67.0 1.699301 1.699301 1.699301 0.0 0.0 0.0
3 Esparto.A 2020-01-04 37 59 38.0 59.0 1.884740 1.884740 1.884740 0.0 0.0 0.0
4 Esparto.A 2020-01-05 38 63 36.0 59.0 3.254815 3.254815 3.254815 0.0 0.0 0.0

Look at the distribution of the differences:

In [42]:
## Double triangle horizontal cutoff
data_dbltri.DoTriHorr_diff.value_counts()
Out[42]:
 0.00    365
-0.01      1
Name: DoTriHorr_diff, dtype: int64
In [43]:
## Double triangle intermediate cutoff
data_dbltri.DoTriInt_diff.value_counts()
Out[43]:
 0.00     358
-0.01       1
 4.63       1
 6.85       1
 3.90       1
 16.50      1
 4.49       1
 2.37       1
 3.14       1
Name: DoTriInt_diff, dtype: int64
In [44]:
## Double triange vertical cutoff
data_dbltri.DoTriVert_diff.value_counts()
Out[44]:
 0.00     357
 10.00      4
-0.01       2
 9.99       2
 20.00      1
Name: DoTriVert_diff, dtype: int64

Investigate rows with the differences occur:

In [45]:
data_dbltri[data_dbltri.DoTriVert_diff > 1]
Out[45]:
station date tmin tmax tmin_tom tmax_yest DoTriHor DoTriInt DoTriVert DoTriHorr_diff DoTriInt_diff DoTriVert_diff
225 Esparto.A 2020-08-13 60 97 71.0 97.0 19.324324 12.027027 12.027027 0.0 4.63 10.00
226 Esparto.A 2020-08-14 71 103 68.0 97.0 19.971429 10.542857 10.542857 0.0 6.85 10.00
229 Esparto.A 2020-08-17 59 95 71.0 96.0 19.159722 12.215278 12.215278 0.0 3.90 10.00
230 Esparto.A 2020-08-18 71 102 71.0 95.0 20.000000 20.000000 20.000000 0.0 16.50 20.00
231 Esparto.A 2020-08-19 71 97 62.0 102.0 19.542857 11.828571 11.828571 0.0 4.49 10.00
269 Esparto.A 2020-09-26 58 90 72.0 85.0 18.875000 12.625000 12.625000 0.0 2.37 9.99
270 Esparto.A 2020-09-27 72 95 69.0 90.0 19.990385 10.375000 10.375000 0.0 3.14 9.99
In [46]:
data_dbltri[data_dbltri.DoTriInt_diff > 1]
Out[46]:
station date tmin tmax tmin_tom tmax_yest DoTriHor DoTriInt DoTriVert DoTriHorr_diff DoTriInt_diff DoTriVert_diff
225 Esparto.A 2020-08-13 60 97 71.0 97.0 19.324324 12.027027 12.027027 0.0 4.63 10.00
226 Esparto.A 2020-08-14 71 103 68.0 97.0 19.971429 10.542857 10.542857 0.0 6.85 10.00
229 Esparto.A 2020-08-17 59 95 71.0 96.0 19.159722 12.215278 12.215278 0.0 3.90 10.00
230 Esparto.A 2020-08-18 71 102 71.0 95.0 20.000000 20.000000 20.000000 0.0 16.50 20.00
231 Esparto.A 2020-08-19 71 97 62.0 102.0 19.542857 11.828571 11.828571 0.0 4.49 10.00
269 Esparto.A 2020-09-26 58 90 72.0 85.0 18.875000 12.625000 12.625000 0.0 2.37 9.99
270 Esparto.A 2020-09-27 72 95 69.0 90.0 19.990385 10.375000 10.375000 0.0 3.14 9.99

There seem to be 7 rows where the results of `HeatU()` for the Double Triangle method with the intermediate and vertical cutoffs that differ from those from the IPM website