AUS2200 xarray rolling mean¶

Aim¶

This recipe shows how to:

Load a month of AUS2200 data (entire spatial domain) with xarray (dask-enabled) and chunk the data for efficient computing
Calculate air temperature perturbations to the daily mean (using xarray rolling mean)

First load some python modules¶

Requires access to the xp65 conda environment

[1]:

import pandas as pd
import xarray as xr
import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import dask.array as da
import dask
from dask.distributed import Client
from dask import delayed

Compute size¶

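This recipe was written for XX-Large resources on ARE (28 cores, 126 GB). The client output below comes from a larger session with 48 single-threaded workers and 188.56 GiB of memory.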

[2]:

# Set up a dask distributed client so that chunks of data can be sent to and
# processed by different cores ("workers").

# Click "Launch dashboard in JupyterLab" within this cell's output to see dask progress

client = Client(threads_per_worker=1)
client

[2]:

Client

Client-58eb0d06-5004-11f1-a5e7-00000082fe80

Connection method: Cluster object	Cluster type: distributed.LocalCluster
Dashboard: /proxy/8787/status

Cluster Info

LocalCluster

85196910

Dashboard: /proxy/8787/status	Workers: 48
Total threads: 48	Total memory: 188.56 GiB
Status: running	Using processes: True

Scheduler Info

Scheduler

Scheduler-c03de054-0562-4fe9-9cd2-9d88ec29df50

Comm: tcp://127.0.0.1:37629	Workers: 0
Dashboard: /proxy/8787/status	Total threads: 0
Started: Just now	Total memory: 0 B

Workers

Workers 0 to 47: each worker has 1 thread and 3.93 GiB of memory, with its own local directory under /jobfs/168368305.gadi-pbs/dask-scratch-space/

Load in the AUS2200 data¶

We use intake to do this. Check out https://access-nri.github.io/interactive-data-catalogue/#/ to explore the available data.

[3]:

import intake
catalog = intake.cat.access_nri

esm_datastore = catalog['AUS2200']

esm_datastore = esm_datastore.search(variable_id='ta', experiment_id='mjo-elnino2016', frequency='1hr')
esm_datastore

/jobfs/168368305.gadi-pbs/ipykernel_1222119/2851339479.py:6: UserWarning: Value aliasing: variable_id='ta' → variable_id=['('fld_s30i204',)','ta']
  esm_datastore = esm_datastore.search(variable_id='ta', experiment_id='mjo-elnino2016', frequency='1hr')

AUS2200 catalog with 1 dataset(s) from 244 asset(s):

	unique
path	244
file_type	1
realm	1
model_id	1
experiment_id	1
frequency	1
variable_id	1
version	1
time_range	244
derived_variable_id	0

[4]:

# Define lat/lon slices covering almost the entire AUS2200 domain
lon_slice = slice(108, 159)
lat_slice = slice(-45.7, -6.831799)

# Height range that selects a single model level (111.7 m)
lev_slice = slice(100, 120)

# Define the time range to select
start_time = "2016-01-01 00:00"
end_time = "2016-02-01 00:00"
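Label-based selection with a slice keeps every coordinate value inside the range and treats both ends as inclusive, which is why a 100 to 120 m slice returns the single 111.7 m level. The following is a minimal sketch on a toy array; the other level heights and the toy time axis are made up for illustration, and only 111.7 and the time bounds come from this recipe.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical level heights; only 111.7 m appears in the AUS2200 output.
toy_lev = np.array([5.0, 21.7, 111.7, 245.0])
# Hourly time stamps starting at 01:00, running a day past the end of the slice.
toy_time = pd.date_range("2016-01-01 01:00", "2016-02-02 00:00", freq="h")

toy = xr.DataArray(
    np.zeros((toy_time.size, toy_lev.size), dtype="float32"),
    coords={"time": toy_time, "lev": toy_lev},
    dims=("time", "lev"),
)

# Same style of selection as in the recipe: both slice ends are inclusive.
subset = toy.sel(
    time=slice("2016-01-01 00:00", "2016-02-01 00:00"),
    lev=slice(100, 120),
)
print(dict(subset.sizes))  # 744 hourly steps and 1 level
```

The 744 time steps match the length of the time dimension in the AUS2200 selection, because the last time stamp (2016-02-01 00:00) is included.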

[5]:

aus2200_ta = esm_datastore.to_dask().sel(
    time = slice(start_time, end_time)
).sel(
    lon=lon_slice, lat=lat_slice, lev = lev_slice
).ta

[6]:

aus2200_ta

[6]:

<xarray.DataArray 'ta' (time: 744, lev: 1, lat: 1963, lon: 2575)> Size: 15GB
dask.array<getitem, shape=(744, 1, 1963, 2575), dtype=float32, chunksize=(1, 1, 424, 520), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 6kB 2016-01-01T01:00:00 ... 2016-02-01
  * lev      (lev) float64 8B 111.7
  * lat      (lat) float64 16kB -45.7 -45.68 -45.66 ... -6.891 -6.871 -6.852
  * lon      (lon) float64 21kB 108.0 108.0 108.1 108.1 ... 158.9 159.0 159.0
Attributes:
    standard_name:          air_temperature
    long_name:              Air Temperature
    comment:                Air Temperature
    units:                  K
    cell_methods:           area: mean time: point
    cell_measures:          area: areacella
    coverage_content_type:  modelResult

Chunks¶

Advanced (but necessary) topic¶

In previous versions of this notebook, we specified the chunks as {"time":6,"lat":-1,"lon":-1,"lev":{}}. For lat and lon, "-1" means that the chunk size in each of those dimensions (1963 for lat, 2575 for lon) equals the length of the dimension. In other words, the dataset is not split into chunks along those dimensions.

However, the dataset was chunked along time (with a chunk size of 6). This is the best we can do for AUS2200, as each file is 6 time steps long. We can rechunk the time dimension later by calling aus2200_ta.chunk({"time":chunksize}), but this can be very slow.

Our aim is to have chunks that are small enough to fit in memory, but large enough to reduce the time spent passing data between workers and the number of operations dask performs. The chunk size with that specification (115 MB) is okay, with around 200 MB being a pretty good aim (although there are no standard rules for optimal chunk sizes, and finding them takes some experimenting).
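The chunk sizes quoted here follow from the array shape and the 4-byte float32 values. As a rough check, this sketch recomputes them from the dimension sizes shown in the array output above (the helper name chunk_mib is ours, and the lev dimension has length 1, so it is left out):

```python
import numpy as np

itemsize = np.dtype("float32").itemsize   # 4 bytes per value
n_time, n_lat, n_lon = 744, 1963, 2575    # dimension sizes from the array output


def chunk_mib(time_len, lat_len, lon_len):
    """Size of one chunk in MiB (2**20 bytes)."""
    return time_len * lat_len * lon_len * itemsize / 2**20


old_chunk = chunk_mib(6, n_lat, n_lon)    # {"time": 6, "lat": -1, "lon": -1}
current_chunk = chunk_mib(1, 424, 520)    # chunksize=(1, 1, 424, 520) in the output
total_gb = n_time * n_lat * n_lon * itemsize / 1e9

print(f"{old_chunk:.1f} MiB, {current_chunk:.2f} MiB, {total_gb:.1f} GB")
# 115.7 MiB, 0.84 MiB, 15.0 GB
```

A 115.7 MiB chunk is small compared with the roughly 3.93 GiB available to each worker in the client output (188.56 GiB shared by 48 workers), which leaves room for several chunks and intermediate results per worker.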

TL;DR¶

Advances in the xarray/dask ecosystem mean that you can now typically just specify chunks="auto" and let dask work out (close to) optimal chunks for you. Intake takes care of this for us.
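To see what automatic chunking chooses for an array of this shape, here is a small sketch using a lazy array of zeros as a stand-in for the AUS2200 data (an assumption for illustration: no files are read, so any on-disk chunking is ignored). Dask aims for a target chunk size set by its array.chunk-size configuration option, so the exact result may differ between installations.

```python
import dask.array as da
import numpy as np

shape = (744, 1, 1963, 2575)  # (time, lev, lat, lon), as in this recipe

# Lazy array: nothing is allocated until we compute, so 15 GB is not needed here.
lazy = da.zeros(shape, dtype="float32", chunks="auto")

chunk_mib = np.prod(lazy.chunksize) * lazy.dtype.itemsize / 2**20
print(lazy.chunksize, f"{chunk_mib:.0f} MiB per chunk", f"{lazy.npartitions} chunks")
```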

Rolling daily mean¶

Now we’d like to compute a rolling daily average temperature. Rolling operations can be very slow, because each output point needs data from neighbouring time chunks.

[7]:

time_window = 24      # equivalent to one day for the hourly data here
min_periods = 12      # for each time step, there must be at least 12 hours in the moving window
                      # for the rolling mean to be defined.
aus2200_ta_daily_mean = aus2200_ta.rolling(
    dim={"time": time_window}, center=True, min_periods=min_periods
).mean()
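To see what this rolling mean does, it can help to try it on a small in-memory series first. The sketch below uses a synthetic three-day hourly temperature (a constant 290 K plus a 5 K diurnal sine wave, both made up for illustration) with the same window settings. Away from the ends of the series, the 24-hour mean removes the diurnal cycle entirely, so subtracting it recovers the sine wave.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic hourly series: 290 K background plus a 5 K diurnal cycle (illustrative values).
time = pd.date_range("2016-01-01 01:00", periods=72, freq="h")
hours = np.arange(time.size)
diurnal = 5.0 * np.sin(2 * np.pi * hours / 24)
toy_ta = xr.DataArray(290.0 + diurnal, coords={"time": time}, dims="time", name="ta")

# Same settings as the recipe: centred 24-hour window, at least 12 valid hours.
toy_mean = toy_ta.rolling(dim={"time": 24}, center=True, min_periods=12).mean()
toy_pert = toy_ta - toy_mean

# Where the window is complete, the mean is the 290 K background.
interior = slice(12, 60)
print(float(abs(toy_mean.isel(time=interior) - 290.0).max()))  # very close to zero
```

Near the start and end of the series the window is incomplete, so the mean there is taken over fewer hours and no longer cancels the diurnal cycle exactly.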

[8]:

aus2200_ta_daily_mean

[8]:

<xarray.DataArray 'ta' (time: 744, lev: 1, lat: 1963, lon: 2575)> Size: 15GB
dask.array<getitem, shape=(744, 1, 1963, 2575), dtype=float32, chunksize=(42, 1, 424, 520), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 6kB 2016-01-01T01:00:00 ... 2016-02-01
  * lev      (lev) float64 8B 111.7
  * lat      (lat) float64 16kB -45.7 -45.68 -45.66 ... -6.891 -6.871 -6.852
  * lon      (lon) float64 21kB 108.0 108.0 108.1 108.1 ... 158.9 159.0 159.0
Attributes:
    standard_name:          air_temperature
    long_name:              Air Temperature
    comment:                Air Temperature
    units:                  K
    cell_methods:           area: mean time: point
    cell_measures:          area: areacella
    coverage_content_type:  modelResult

Computing¶

Note that dask hasn’t done anything yet because we haven’t actually needed to access any data (with only metadata shown above so far). When we start making plots in the following cells, then dask will actually start doing computations and loading the required data into memory.

If for some reason we would like to hold all of the data in memory, we can use the compute() or persist() methods, for example:

aus2200_ta_daily_mean = aus2200_ta_daily_mean.persist()
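The two methods differ in what they return. The sketch below shows this on a tiny dask-backed array (a stand-in, not AUS2200 data) with dask's default local scheduler: persist() returns an array that is still dask-backed but whose chunks have been computed and kept in memory, while compute() returns a NumPy-backed array.

```python
import dask.array as da
import numpy as np
import xarray as xr

# Tiny dask-backed array standing in for the AUS2200 data.
lazy = xr.DataArray(da.ones((4, 3), chunks=(2, 3)), dims=("time", "x"))

lazy_mean = lazy.mean("time")     # lazy: only a task graph so far
persisted = lazy_mean.persist()   # still dask-backed, chunks computed and held in memory
loaded = lazy_mean.compute()      # NumPy-backed result in the notebook process

print(type(lazy_mean.data).__name__, type(persisted.data).__name__, type(loaded.data).__name__)
```

With a distributed client like the one in this recipe, persisted chunks stay in the workers' memory, whereas compute() gathers the full result into the notebook process, so it should fit in that process's memory.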

Plotting¶

Let's plot a single time step and start some computations. Note that the daily mean smooths out small variations compared with the original temperature field.

[9]:

# First for the original air temperature data
ax = plt.axes(projection=ccrs.PlateCarree())
aus2200_ta.sel(time="2016-01-22 08:00").plot(ax=ax)
ax.coastlines()

[9]:

<cartopy.mpl.feature_artist.FeatureArtist at 0x15115ccff830>

../_images/Recipes_AUS2200_xarray_rolling_16_1.png

[10]:

# And for the rolling daily mean
ax = plt.axes(projection=ccrs.PlateCarree())
aus2200_ta_daily_mean.sel(time="2016-01-22 08:00").plot(ax=ax)
ax.coastlines()

/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/client.py:3387: UserWarning: Sending large graph of size 10.43 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
  warnings.warn(

[10]:

<cartopy.mpl.feature_artist.FeatureArtist at 0x1512247495e0>

../_images/Recipes_AUS2200_xarray_rolling_17_2.png

Perturbations¶

Now calculate the temperature perturbations relative to the daily mean. Perturbations will re-introduce and highlight small-scale factors like convective cold pools and sea breezes along the coast, as well as allowing us to quantify the diurnal cycle.

[11]:

aus2200_ta_daily_pert = aus2200_ta - aus2200_ta_daily_mean

Plotting¶

As above, but for temperature perturbations. We also plot the perturbations at a single lat/lon location for the entire month.

Note that the time series computation takes a lot longer, because dask needs to access many more files on disk (AUS2200 data is saved in 6-hourly files, as discussed earlier).
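A rough count shows why the time series is so much more expensive than a single map. This sketch assumes 6 hourly time steps per file (as stated above), file boundaries at steps 0, 6, 12, ... of the selected month, and a centred window reaching from 12 steps before to 11 steps after each time. The alignment details are assumptions for illustration and are not taken from the AUS2200 files themselves.

```python
steps_per_file = 6   # from this recipe: each AUS2200 file holds 6 hourly steps
n_steps = 744        # hourly steps in the selected month
window = 24          # rolling window length in hours


def files_for_window(center_index):
    """Number of distinct files covering one centred 24-step window.

    Assumes files start at steps 0, 6, 12, ... and that the window spans
    center_index - 12 to center_index + 11, clipped to the available steps.
    """
    first = max(center_index - window // 2, 0)
    last = min(center_index + window // 2 - 1, n_steps - 1)
    return last // steps_per_file - first // steps_per_file + 1


files_for_map = files_for_window(500)              # one time step of the rolling mean
files_for_series = -(-n_steps // steps_per_file)   # ceil(744 / 6): every file in the month
print(files_for_map, files_for_series)  # 5 124
```

Under these assumptions, a map at one time step reads from about 5 files, while the month-long time series has to open all 124.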

[12]:

ax = plt.axes(projection=ccrs.PlateCarree())
aus2200_ta_daily_pert.sel(time="2016-01-22 08:00").plot(ax=ax)
ax.coastlines()

/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/client.py:3387: UserWarning: Sending large graph of size 12.65 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
  warnings.warn(

[12]:

<cartopy.mpl.feature_artist.FeatureArtist at 0x1512271114f0>

../_images/Recipes_AUS2200_xarray_rolling_21_2.png

[13]:

# This currently fails: see https://github.com/dask/dask/issues/12198
# (the result is not assigned, so aus2200_ta_daily_pert is never overwritten)
try:
    aus2200_ta_daily_pert.sel(lat=-31.9275, lon=115.9764, method="nearest").plot()
except ValueError:
    print("Failed due to dask/bottleneck issue... see below for fix")

/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/client.py:3387: UserWarning: Sending large graph of size 12.77 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
  warnings.warn(
2026-05-15 12:21:50,962 - distributed.worker - ERROR - Compute Failed
Key:       ('getitem-overlap-_trim-9debcf9e98eb529079aaf1b071f41466', 0, 0, 2, 0)
State:     executing
Task:  <Task ('getitem-overlap-_trim-9debcf9e98eb529079aaf1b071f41466', 0, 0, 2, 0) _execute_subgraph(...)>
Exception: "ValueError('Moving window (=24) must between 1 and 23, inclusive')"
Traceback: ''

Failed due to dask/bottleneck issue... see below for fix

Because of an annoying bug in dask >= 2024.11.0, we have to do a bit of a workaround here.¶

We’re going to select our point and then load the whole array into memory, so that we don’t touch dask and instead delegate the work to numpy.
This is generally inadvisable, but it’s okay for a single point because there isn’t much data.
If you try to do this with a large 3D or 4D array, you will probably run out of memory.

See https://github.com/dask/dask/issues/12198 for the bug. At some point in the future, the above cell might just work.

The location we’re trying to plot is near Perth, so we’ll just call it that.

[14]:

perth_ta = aus2200_ta.sel(lat=-31.9275, lon=115.9764, method="nearest")
perth_ta

[14]:

<xarray.DataArray 'ta' (time: 744, lev: 1)> Size: 3kB
dask.array<getitem, shape=(744, 1), dtype=float32, chunksize=(1, 1), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 6kB 2016-01-01T01:00:00 ... 2016-02-01
  * lev      (lev) float64 8B 111.7
    lat      float64 8B -31.92
    lon      float64 8B 116.0
Attributes:
    standard_name:          air_temperature
    long_name:              Air Temperature
    comment:                Air Temperature
    units:                  K
    cell_methods:           area: mean time: point
    cell_measures:          area: areacella
    coverage_content_type:  modelResult

[15]:

# This array is only 3 kB, so we can load it all into memory
perth_ta = perth_ta.compute()
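Since loading a large array this way can exhaust memory, one option is to check the size before calling compute(). The helper below is a hypothetical sketch, not part of xarray or dask, and its 100 MiB default limit is an arbitrary illustrative value. It is shown on a toy array with the same shape as the single-point time series.

```python
import dask.array as da
import xarray as xr


def load_if_small(arr, max_bytes=100 * 2**20):
    """Load a dask-backed DataArray into memory only if it is small enough.

    Hypothetical helper; the 100 MiB default is an arbitrary illustrative limit.
    """
    if arr.nbytes > max_bytes:
        raise MemoryError(f"{arr.nbytes} bytes exceeds the {max_bytes} byte limit")
    return arr.compute()


# Toy stand-in for the single-point time series (744 hourly values, 1 level).
toy_point = xr.DataArray(
    da.zeros((744, 1), dtype="float32", chunks=(6, 1)), dims=("time", "lev")
)
loaded = load_if_small(toy_point)
print(toy_point.nbytes)  # 2976 bytes, i.e. about 3 kB
```

Calling the same helper on the full 15 GB array would raise MemoryError instead of attempting to load it.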

[16]:

# And now we can redo the rolling mean without any issues
perth_ta_daily_mean = perth_ta.rolling(
    dim={"time":24},center=True,min_periods=12
).mean()

perth_ta_daily_pert = perth_ta - perth_ta_daily_mean

[17]:

perth_ta_daily_pert.plot()

[17]:

[<matplotlib.lines.Line2D at 0x1511373b2ff0>]

../_images/Recipes_AUS2200_xarray_rolling_27_1.png

Andrew Brown
ARC Centre of Excellence for 21st Century Weather, University of Melbourne

Samuel Green
ARC Centre of Excellence for 21st Century Weather & Climate Change Research Centre, UNSW Sydney

Charles Turner
ACCESS-NRI, Australian National University, Canberra

If you have any enquiries, suggested improvements or bug reports related to this recipe, please open an issue or start a discussion in this repository.