The USgas series is a ts object with 1 variable and 238 observations
Frequency: 12
Start time: 2000 1
End time: 2019 10
ME607 - Séries Temporais
Instituto de Matemática, Estatística e Computação Científica (IMECC),
Universidade Estadual de Campinas (UNICAMP).
Período
Unidade regular e repetitiva de tempo que divide a série temporal em subconjuntos consecutivos e igualmente longos (por exemplo, para séries mensais, um período completo seria um ano).
Frequência
É o tamanho ou número de unidades do período da série. Por exemplo:
Padrão sazonal único: quando existe apenas um único padrão sazonal.
Padrão sazonal múltiple: quando existe mais do que um único padrão sazonal
Consumo de energia elétrica
Demanda de transporte público
Vendas de varejo
Dica de ouro
Estatisticas descritivas para identificar padrões sazonais
Além do gráfico da série temporal, calcular medidas resumo para cada unidade de frequência da série ajuda na identificação. Por exemplo, se a frequência for 12 (dados mensais), calcular a média e desvio padrão por mês.
Jan Feb Mar Apr May Jun Jul Aug Sep Oct
2000 2510.5 2330.7 2050.6 1783.3 1632.9 1513.1 1525.6 1653.1 1475.0 1567.8
2001 2677.0 2309.5 2246.6 1807.2 1522.4 1444.4 1598.1 1669.2 1494.1 1649.1
2002 2487.6 2242.4 2258.4 1881.0 1611.5 1591.4 1748.4 1725.7 1542.2 1645.9
2003 2700.5 2500.3 2197.9 1743.5 1514.7 1368.4 1600.5 1651.6 1428.6 1553.2
2004 2675.8 2511.1 2100.9 1745.2 1573.0 1483.7 1584.9 1578.0 1482.2 1557.2
2005 2561.9 2243.0 2205.8 1724.9 1522.6 1534.1 1686.6 1695.1 1422.5 1428.2
2006 2165.3 2144.4 2126.4 1681.0 1526.3 1550.9 1758.7 1751.7 1462.1 1644.2
2007 2475.6 2567.0 2128.8 1810.1 1559.1 1555.2 1659.9 1896.1 1590.5 1627.8
2008 2734.0 2503.4 2278.2 1823.9 1576.4 1604.2 1708.6 1682.9 1460.9 1635.8
2009 2729.7 2332.5 2170.7 1741.3 1504.0 1527.8 1658.0 1736.5 1575.0 1666.5
2010 2809.8 2481.0 2142.9 1691.8 1617.3 1649.5 1825.8 1878.9 1637.5 1664.9
2011 2888.6 2452.4 2230.5 1825.0 1667.4 1657.3 1890.5 1891.8 1655.6 1744.5
2012 2756.2 2500.7 2127.8 1953.1 1873.8 1868.4 2069.8 2008.8 1807.2 1901.1
2013 2878.8 2567.2 2521.1 1967.5 1752.5 1742.9 1926.3 1927.4 1767.0 1866.8
2014 3204.1 2741.2 2557.9 1961.7 1810.2 1745.4 1881.0 1933.1 1809.3 1912.8
2015 3115.0 2925.2 2591.3 2007.9 1858.1 1899.9 2067.7 2052.7 1901.3 1987.3
2016 3091.7 2652.3 2356.3 2083.8 1965.8 2000.7 2186.6 2208.4 1947.8 1925.2
2017 2914.2 2340.6 2523.7 1932.5 1892.5 1910.9 2142.1 2094.3 1920.9 2032.0
2018 3335.0 2705.9 2792.6 2346.3 2050.9 2058.7 2344.6 2307.7 2151.5 2279.1
2019 3399.9 2999.2 2899.9 2201.1 2121.0 2115.2 2407.5 2437.2 2215.6 2472.3
Nov Dec
2000 1908.5 2587.5
2001 1701.0 2120.2
2002 1913.6 2378.9
2003 1753.6 2263.7
2004 1782.8 2327.7
2005 1663.4 2326.4
2006 1765.4 2122.8
2007 1834.5 2399.2
2008 1868.9 2399.7
2009 1776.2 2491.9
2010 1973.3 2714.1
2011 2031.9 2541.9
2012 2167.8 2503.9
2013 2316.9 2920.8
2014 2357.5 2679.2
2015 2249.1 2588.2
2016 2159.4 2866.3
2017 2357.7 3084.5
2018 2709.9 2993.1
2019
# A tibble: 12 × 3
month media sd
<fct> <dbl> <dbl>
1 Jan 2806. 309.
2 Feb 2502. 223.
3 Mar 2325. 241.
4 Apr 1886. 175.
5 May 1708. 194.
6 Jun 1691. 216.
7 Jul 1864. 261.
8 Aug 1889. 237.
9 Sep 1687. 242.
10 Oct 1788. 260.
11 Nov 2015. 284.
12 Dec 2543. 278.
Se trabalharmos com séries temporais de alta frequência, o mesmo processo deve ser feito para cada frequência potencinal da série. Por exemplo:
PPS: Possíveis Padrões Sazonais
dia (dividido em horas)
Semana
Mês
library(lubridate)
library(xts)
UKgrid_df <- data.frame(time = index(UKgrid), UKgrid = as.numeric(UKgrid)) |>
mutate(hour = hour(time),
weekday = wday(time, label = TRUE, abbr = TRUE),
month = factor(month.abb[month(time)], levels = month.abb))
UKgrid_hour <- UKgrid_df |> group_by(hour) |> summarise(media = mean(UKgrid, na.rm = TRUE), std = sd(UKgrid, na.rm = TRUE))
UKgrid_hour
# A tibble: 24 × 3
hour media std
<int> <dbl> <dbl>
1 0 55622. 9259.
2 1 54343. 9482.
3 2 52920. 9485.
4 3 51546. 9165.
5 4 50509. 8557.
6 5 51747. 8657.
7 6 59048. 10489.
8 7 68627. 13347.
9 8 73581. 13003.
10 9 76320. 12557.
# ℹ 14 more rows
plot_ly(UKgrid_hour) |>
add_lines(x = ~ hour, y = ~media, name = "Mean") %>%
add_lines(x = ~ hour, y = ~std, name = "Standard Deviation", yaxis = "y2",
line = list(color = "red", dash = "dash", width = 3)) %>%
layout(title = "The UK Grid National Demand - Hourly Average vs. Standard Deviation",
yaxis = list(title = "Mean"),
yaxis2 = list(overlaying = "y",
side = "right",
title = "Standard Deviation"),
xaxis = list(title="Hour of the day"),
legend = list(x = 0.05, y = 0.9),
margin = list(l = 50, r = 50))
UKgrid_df |>
filter(hour == 3 | hour == 10) |>
group_by(hour, weekday) |> summarise(media = mean(UKgrid, na.rm = TRUE), std = sd(UKgrid, na.rm = TRUE)) |>
plot_ly (x = ~weekday, y = ~media, type ="bar", color = ~factor(hour)) %>%
layout(title = "The Hourly Average Demand by Weekday",
yaxis = list(title = "Mean"),
xaxis = list(title = "Weekday"))
UKgrid_df |> group_by(hour, weekday) |> summarise(media = mean(UKgrid, na.rm = TRUE), std = sd(UKgrid, na.rm = TRUE)) |>
plot_ly (x = ~weekday, y = ~media, type ="bar", color = ~factor(hour)) %>%
layout(title = "The Hourly Average Demand by Weekday",
yaxis = list(title = "Mean"),
xaxis = list(title = "Weekday"))
UKgrid_df |>
filter(hour == 3 | hour == 9 | hour == 15) |>
group_by(hour, month) |> summarise(media = mean(UKgrid, na.rm = TRUE), std = sd(UKgrid, na.rm = TRUE)) |>
plot_ly (x = ~month, y = ~media, type ="bar", color = ~factor(hour)) %>%
layout(title = "The Hourly Average Demand by Month",
yaxis = list(title = "Mean"),
xaxis = list(title = "Month"))
Jan Feb Mar Apr May Jun Jul Aug Sep Oct
2000 2510.5 2330.7 2050.6 1783.3 1632.9 1513.1 1525.6 1653.1 1475.0 1567.8
2001 2677.0 2309.5 2246.6 1807.2 1522.4 1444.4 1598.1 1669.2 1494.1 1649.1
2002 2487.6 2242.4 2258.4 1881.0 1611.5 1591.4 1748.4 1725.7 1542.2 1645.9
2003 2700.5 2500.3 2197.9 1743.5 1514.7 1368.4 1600.5 1651.6 1428.6 1553.2
2004 2675.8 2511.1 2100.9 1745.2 1573.0 1483.7 1584.9 1578.0 1482.2 1557.2
2005 2561.9 2243.0 2205.8 1724.9 1522.6 1534.1 1686.6 1695.1 1422.5 1428.2
2006 2165.3 2144.4 2126.4 1681.0 1526.3 1550.9 1758.7 1751.7 1462.1 1644.2
2007 2475.6 2567.0 2128.8 1810.1 1559.1 1555.2 1659.9 1896.1 1590.5 1627.8
2008 2734.0 2503.4 2278.2 1823.9 1576.4 1604.2 1708.6 1682.9 1460.9 1635.8
2009 2729.7 2332.5 2170.7 1741.3 1504.0 1527.8 1658.0 1736.5 1575.0 1666.5
2010 2809.8 2481.0 2142.9 1691.8 1617.3 1649.5 1825.8 1878.9 1637.5 1664.9
2011 2888.6 2452.4 2230.5 1825.0 1667.4 1657.3 1890.5 1891.8 1655.6 1744.5
2012 2756.2 2500.7 2127.8 1953.1 1873.8 1868.4 2069.8 2008.8 1807.2 1901.1
2013 2878.8 2567.2 2521.1 1967.5 1752.5 1742.9 1926.3 1927.4 1767.0 1866.8
2014 3204.1 2741.2 2557.9 1961.7 1810.2 1745.4 1881.0 1933.1 1809.3 1912.8
2015 3115.0 2925.2 2591.3 2007.9 1858.1 1899.9 2067.7 2052.7 1901.3 1987.3
2016 3091.7 2652.3 2356.3 2083.8 1965.8 2000.7 2186.6 2208.4 1947.8 1925.2
2017 2914.2 2340.6 2523.7 1932.5 1892.5 1910.9 2142.1 2094.3 1920.9 2032.0
2018 3335.0 2705.9 2792.6 2346.3 2050.9 2058.7 2344.6 2307.7 2151.5 2279.1
2019 3399.9 2999.2 2899.9 2201.1 2121.0 2115.2 2407.5 2437.2 2215.6 2472.3
Nov Dec
2000 1908.5 2587.5
2001 1701.0 2120.2
2002 1913.6 2378.9
2003 1753.6 2263.7
2004 1782.8 2327.7
2005 1663.4 2326.4
2006 1765.4 2122.8
2007 1834.5 2399.2
2008 1868.9 2399.7
2009 1776.2 2491.9
2010 1973.3 2714.1
2011 2031.9 2541.9
2012 2167.8 2503.9
2013 2316.9 2920.8
2014 2357.5 2679.2
2015 2249.1 2588.2
2016 2159.4 2866.3
2017 2357.7 3084.5
2018 2709.9 2993.1
2019
m.c.seq.row..seq.n...seq.col..drop...FALSE.
2005-04-01 00:00:00 65080
2005-04-01 01:00:00 68207
2005-04-01 02:00:00 69172
2005-04-01 03:00:00 66769
2005-04-01 04:00:00 64660
2005-04-01 05:00:00 65209
2005-04-01 06:00:00 72376
2005-04-01 07:00:00 82679
2005-04-01 08:00:00 88706
2005-04-01 09:00:00 91108
...
2019-10-08 14:00:00 56875
2019-10-08 15:00:00 57638
2019-10-08 16:00:00 61961
2019-10-08 17:00:00 66902
2019-10-08 18:00:00 69943
2019-10-08 19:00:00 71484
2019-10-08 20:00:00 66677
2019-10-08 21:00:00 60445
2019-10-08 22:00:00 52730
2019-10-08 23:00:00 45461
Definição
A representação espectral de uma série temporal estacionária \(\{Z_t\}\) consiste na decomposição da série em uma soma de componentes sinusoidais, cujos coeficientes são aleatórios e não correlacionados. Juntamente com essa decomposição, ocorre uma decomposição correspondente da função de autocovariância de \(\{Z_t\}\) também em termos de sinusoides.
Função de densidade espectral
Seja \(\{Z_t, t \in \mathbb{Z} \}\) um processo estocástico estacionário com média zero e função de autocovariância tal que \[\sum_{\tau = -\infty}^{\infty} |\gamma(\tau)| < \infty, \quad \text{(condição de independência assintótica)}.\]
A função de densidade espectral (ou espectro) de \(Z_t\) é definido por \[f(\lambda) = \dfrac{1}{2\pi} \sum_{\tau = -\infty}^{\infty} \gamma(\tau) e^{-i\lambda \tau}, -\infty < \lambda < \infty,\] em que \(e^{i \lambda} = \cos \lambda + i\sin \lambda \quad\) e \(\quad i = \sqrt{-1}\).
Propriedades:
\[\gamma(\tau) = \int_{-\pi}^{\pi} e^{i\lambda \tau} f(\lambda) d\lambda \quad \tau \in \mathbb{Z}.\]
DGP1: \[Z_t \sim RB(0, \sigma^2)\]
DGP2: \[Z_t = \dfrac{1}{3}(a_{t-1} + a_t + a_{t+1}), \quad a_t \sim RB(0, \sigma^2)\]
DGP3: \[Z_t = \phi Z_{t-1} + a_t, \quad a_t \sim RB(0, \sigma^2)\]
Se \(\{Z_t, t = 1, \cdots, N\}\) for uma série temporal estacionária com função de autocovariância \(\gamma(\cdot)\) e densidade espectral \(f(\cdot)\), então:
Transformada Discreta de Fourier (TDF)
Seja \(\{Z_t, t = 1, \cdots, N\}\) uma realização de um processo estocástico com média zero. Então, a TDF é dada por \[d_j^{(N)} = \dfrac{1}{\sqrt{N}} \sum_{t = 1}^N Z_t e^{-i 2 \pi j t / N}, \quad j = 0, \cdots, N-1\]
Isto sugere que, dada uma realização \(\{Z_t, t = 1, \cdots, N\}\), um estimador para \(2 \pi f(\omega_j)\) é \[I_j^{(N)} = |d_j^{(N)}|^2 = \dfrac{1}{N} \big | \sum_{t = 1}^N Z_t e^{-i \omega_j t} \big |^2,\]
em que \(\omega_j = \dfrac{2 \pi j}{N},\) \(j = 0, \cdots, N-1\) são chamadas de frequências de Fourier.
Periodograma
Seja \(\{Z_t, t = 1, \cdots, N\}\) uma realização de um processo estocástico estacionário. Definimos o periodograma como \[I^{(N)}_j = |d_j^{(N)}|^2, \quad j = 0, \cdots, N - 1\]
Propriedades
\[I^{(N)}(\lambda) = |d^{(N)}(\lambda)|^2 = \dfrac{1}{N} \big | \sum_{t = 1}^N Z_t e^{-i \lambda t} \big |^2, \quad \lambda \in \{\frac{2 \pi j}{N}: j = 0, \cdots, N - 1 \}\]
Podemos utilizar a função periodogram()
do pacote TSA
ou alternativamente, utilizar uma implementação própria.
1/0.081481481 = 12.27273
[,1] [,2]
[1,] 0.007407407 1.269290e+02
[2,] 0.014814815 1.978911e+02
[3,] 0.022222222 3.561589e+01
[4,] 0.029629630 1.258122e+02
[5,] 0.037037037 7.006523e+01
[6,] 0.044444444 5.142501e+02
[7,] 0.051851852 1.047800e+03
[8,] 0.059259259 6.071532e+02
[9,] 0.066666667 2.217274e+03
[10,] 0.074074074 8.302572e+03
[11,] 0.081481481 1.169621e+05
[12,] 0.088888889 2.365537e+04
[13,] 0.096296296 4.512837e+03
[14,] 0.103703704 2.486425e+03
[15,] 0.111111111 1.321912e+03
[16,] 0.118518519 4.937583e+02
[17,] 0.125925926 2.487421e+02
[18,] 0.133333333 1.384633e+02
[19,] 0.140740741 1.086324e+03
[20,] 0.148148148 5.429376e+02
[21,] 0.155555556 3.183138e+03
[22,] 0.162962963 1.097292e+04
[23,] 0.170370370 2.240503e+04
[24,] 0.177777778 3.379802e+03
[25,] 0.185185185 1.476595e+03
[26,] 0.192592593 5.639251e+02
[27,] 0.200000000 6.691914e+02
[28,] 0.207407407 4.506652e+02
[29,] 0.214814815 4.227881e+02
[30,] 0.222222222 8.464879e+02
[31,] 0.229629630 4.859353e+02
[32,] 0.237037037 6.195829e+02
[33,] 0.244444444 1.474465e+03
[34,] 0.251851852 3.860543e+03
[35,] 0.259259259 1.474002e+02
[36,] 0.266666667 1.544444e+02
[37,] 0.274074074 4.574005e-01
[38,] 0.281481481 3.195592e+01
[39,] 0.288888889 1.588182e+01
[40,] 0.296296296 2.072899e+00
[41,] 0.303703704 4.719144e+00
[42,] 0.311111111 6.511984e+01
[43,] 0.318518519 7.469490e+00
[44,] 0.325925926 2.729236e+02
[45,] 0.333333333 4.240071e+03
[46,] 0.340740741 2.537939e+02
[47,] 0.348148148 1.588545e+02
[48,] 0.355555556 8.422694e+01
[49,] 0.362962963 5.987252e+01
[50,] 0.370370370 9.698635e+01
[51,] 0.377777778 4.352332e+01
[52,] 0.385185185 1.740587e+02
[53,] 0.392592593 1.218460e+02
[54,] 0.400000000 1.889818e+02
[55,] 0.407407407 3.221634e+02
[56,] 0.414814815 2.976223e+03
[57,] 0.422222222 1.691171e+02
[58,] 0.429629630 1.484993e+02
[59,] 0.437037037 1.761796e+02
[60,] 0.444444444 3.558530e+01
[61,] 0.451851852 1.074108e+02
[62,] 0.459259259 2.790024e+02
[63,] 0.466666667 2.958292e+02
[64,] 0.474074074 4.943861e+01
[65,] 0.481481481 6.444015e+01
[66,] 0.488888889 1.269115e+02
[67,] 0.496296296 9.920785e+01
Utilizando a função própria, o pico acontece em 1 / 0.083333333 = 12
freq periodograma
1 0.000000000 7.450021e+01
2 0.007575758 6.158361e+01
3 0.015151515 1.006007e+02
4 0.022727273 1.534280e+01
5 0.030303030 6.478658e+01
6 0.037878788 8.589472e+01
7 0.045454545 1.947432e+02
8 0.053030303 5.216535e+02
9 0.060606061 2.029744e+02
10 0.068181818 1.209833e+03
11 0.075757576 6.609245e+03
12 0.083333333 6.415949e+04
13 0.090909091 5.333546e+03
14 0.098484848 1.311309e+03
15 0.106060606 4.120956e+02
16 0.113636364 1.051530e+02
17 0.121212121 6.160094e+01
18 0.128787879 3.014609e+02
19 0.136363636 1.756392e+02
20 0.143939394 2.059318e+02
21 0.151515152 1.439459e+02
22 0.159090909 1.232017e+03
23 0.166666667 1.622204e+04
24 0.174242424 1.651894e+03
25 0.181818182 4.092857e+02
26 0.189393939 4.973683e+02
27 0.196969697 2.227047e+02
28 0.204545455 1.652857e+02
29 0.212121212 3.999853e+02
30 0.219696970 2.861285e+02
31 0.227272727 1.012028e+02
32 0.234848485 2.726121e+02
33 0.242424242 3.077507e+02
34 0.250000000 2.470446e+03
35 0.257575758 1.118472e+02
36 0.265151515 3.807159e+01
37 0.272727273 9.370731e+00
38 0.280303030 1.434404e+01
39 0.287878788 6.496930e-01
40 0.295454545 9.726165e-01
41 0.303030303 6.876308e+00
42 0.310606061 3.906259e+01
43 0.318181818 2.946280e+00
44 0.325757576 1.156887e+02
45 0.333333333 2.120036e+03
46 0.340909091 1.204138e+02
47 0.348484848 9.807695e+01
48 0.356060606 5.365772e+01
49 0.363636364 2.229371e+01
50 0.371212121 4.439512e+01
51 0.378787879 5.985471e+00
52 0.386363636 1.011943e+02
53 0.393939394 7.175569e+01
54 0.401515152 2.148975e+02
55 0.409090909 1.913888e+02
56 0.416666667 1.209661e+03
57 0.424242424 3.599827e+01
58 0.431818182 1.075048e+02
59 0.439393939 5.107255e+01
60 0.446969697 7.137482e+01
61 0.454545455 2.989913e+01
62 0.462121212 2.934238e+01
63 0.469696970 8.765993e+01
64 0.477272727 9.743590e+01
65 0.484848485 6.408620e+00
66 0.492424242 5.652648e+01
67 0.500000000 8.233381e+01
68 0.507575758 5.652648e+01
69 0.515151515 6.408620e+00
70 0.522727273 9.743590e+01
71 0.530303030 8.765993e+01
72 0.537878788 2.934238e+01
73 0.545454545 2.989913e+01
74 0.553030303 7.137482e+01
75 0.560606061 5.107255e+01
76 0.568181818 1.075048e+02
77 0.575757576 3.599827e+01
78 0.583333333 1.209661e+03
79 0.590909091 1.913888e+02
80 0.598484848 2.148975e+02
81 0.606060606 7.175569e+01
82 0.613636364 1.011943e+02
83 0.621212121 5.985471e+00
84 0.628787879 4.439512e+01
85 0.636363636 2.229371e+01
86 0.643939394 5.365772e+01
87 0.651515152 9.807695e+01
88 0.659090909 1.204138e+02
89 0.666666667 2.120036e+03
90 0.674242424 1.156887e+02
91 0.681818182 2.946280e+00
92 0.689393939 3.906259e+01
93 0.696969697 6.876308e+00
94 0.704545455 9.726165e-01
95 0.712121212 6.496930e-01
96 0.719696970 1.434404e+01
97 0.727272727 9.370731e+00
98 0.734848485 3.807159e+01
99 0.742424242 1.118472e+02
100 0.750000000 2.470446e+03
101 0.757575758 3.077507e+02
102 0.765151515 2.726121e+02
103 0.772727273 1.012028e+02
104 0.780303030 2.861285e+02
105 0.787878788 3.999853e+02
106 0.795454545 1.652857e+02
107 0.803030303 2.227047e+02
108 0.810606061 4.973683e+02
109 0.818181818 4.092857e+02
110 0.825757576 1.651894e+03
111 0.833333333 1.622204e+04
112 0.840909091 1.232017e+03
113 0.848484848 1.439459e+02
114 0.856060606 2.059318e+02
115 0.863636364 1.756392e+02
116 0.871212121 3.014609e+02
117 0.878787879 6.160094e+01
118 0.886363636 1.051530e+02
119 0.893939394 4.120956e+02
120 0.901515152 1.311309e+03
121 0.909090909 5.333546e+03
122 0.916666667 6.415949e+04
123 0.924242424 6.609245e+03
124 0.931818182 1.209833e+03
125 0.939393939 2.029744e+02
126 0.946969697 5.216535e+02
127 0.954545455 1.947432e+02
128 0.962121212 8.589472e+01
129 0.969696970 6.478658e+01
130 0.977272727 1.534280e+01
131 0.984848485 1.006007e+02
132 0.992424242 6.158361e+01
periodogram()
calcula o periodograma para as primeiras \(N/2\) frequências. Apenas \(N/2\) são suficientes pois depois os valores se repetem (veja o periodograma feito com nossa próoria função)periodogram()
e pela nossa função periodograma()
, não são os mesmos.periodogram()
utiliza uma versão padronizada do periodograma. Isto é opcional pois, embora os numeros sejam diferentes, o mesmo padrão será observado.Carlos Trucíos (IMECC/UNICAMP) | ME607 - Séries Temporais | ctruciosm.github.io