LLM Copilot
This walkthrough covers the functime.llm module, which provides namespaced polars DataFrame methods for passing functime forecasts to Large Language Models (LLMs).
Let's use OpenAI's GPT models to analyze commodity price forecasts created by a functime forecaster. By default we use gpt-3.5-turbo.
Load data
import os
from IPython.display import display, Markdown
os.environ["OPENAI_API_KEY"] = "..." # Your API key here
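Rather than pasting the key into the notebook, one option is to set `OPENAI_API_KEY` in the shell and fail early if it is missing. The helper below is a hypothetical sketch (`require_api_key` is not part of functime or the OpenAI client); it only reads the environment and treats the `"..."` placeholder used above as unset.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from `env` (defaults to os.environ) or raise if unset."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    # "..." is the placeholder value used in this walkthrough, not a real key
    if not key or key == "...":
        raise RuntimeError(f"Set {name} in your environment before calling the LLM methods.")
    return key
```

Calling `require_api_key()` at the top of the notebook surfaces a missing key before any forecast is sent to the API.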
%%capture
import polars as pl
from functime.cross_validation import train_test_split
from functime.forecasting import knn
import functime.llm # Importing this module registers the `llm` namespace on pl.DataFrame
y = pl.read_parquet("../../data/commodities.parquet")
entity_col, time_col, target_col = y.columns
test_size = 30
freq = "1mo"
y_train, y_test = train_test_split(test_size)(y)
print("🎯 Target variable (y) -- train set:")
y_train.collect()
🎯 Target variable (y) -- train set:
commodity_type | time | price |
---|---|---|
str | datetime[ns] | f64 |
"Rapeseed oil" | 2002-02-01 00:00:00 | 423.45 |
"Rapeseed oil" | 2002-03-01 00:00:00 | 415.85 |
"Rapeseed oil" | 2002-04-01 00:00:00 | 410.77 |
"Rapeseed oil" | 2002-05-01 00:00:00 | 414.82 |
"Rapeseed oil" | 2002-06-01 00:00:00 | 451.04 |
"Rapeseed oil" | 2002-07-01 00:00:00 | 477.29 |
"Rapeseed oil" | 2002-08-01 00:00:00 | 521.14 |
"Rapeseed oil" | 2002-09-01 00:00:00 | 525.01 |
"Rapeseed oil" | 2002-10-01 00:00:00 | 539.31 |
"Rapeseed oil" | 2002-11-01 00:00:00 | 593.04 |
"Rapeseed oil" | 2002-12-01 00:00:00 | 616.49 |
"Rapeseed oil" | 2003-01-01 00:00:00 | 623.72 |
… | … | … |
"Liquefied natu… | 2019-10-01 00:00:00 | 9.98 |
"Liquefied natu… | 2019-11-01 00:00:00 | 10.04 |
"Liquefied natu… | 2019-12-01 00:00:00 | 10.06 |
"Liquefied natu… | 2020-01-01 00:00:00 | 9.89 |
"Liquefied natu… | 2020-02-01 00:00:00 | 9.89 |
"Liquefied natu… | 2020-03-01 00:00:00 | 10.21 |
"Liquefied natu… | 2020-04-01 00:00:00 | 10.01 |
"Liquefied natu… | 2020-05-01 00:00:00 | 10.08 |
"Liquefied natu… | 2020-06-01 00:00:00 | 8.97 |
"Liquefied natu… | 2020-07-01 00:00:00 | 7.79 |
"Liquefied natu… | 2020-08-01 00:00:00 | 6.34 |
"Liquefied natu… | 2020-09-01 00:00:00 | 5.88 |
We'll make a prediction using a knn forecaster.
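The forecaster below is fit with `lags=24`. As a rough illustration of what lag features mean for an autoregressive model, the sketch below turns a single series into (features, target) pairs where each target is predicted from the values that precede it. This is a plain-Python sketch under that assumption; `make_lag_matrix` is a hypothetical helper and does not reproduce functime's internal feature construction.

```python
def make_lag_matrix(series, lags):
    """Build (X, y) pairs where each row of X holds the `lags` values before y."""
    if lags < 1 or lags >= len(series):
        raise ValueError("lags must be at least 1 and smaller than the series length")
    X, y = [], []
    for t in range(lags, len(series)):
        X.append(series[t - lags:t])  # the window of previous observations
        y.append(series[t])           # the value to predict
    return X, y


# Toy series with made-up values
X, y = make_lag_matrix([1.0, 2.0, 3.0, 4.0, 5.0], lags=2)
```

With 24 monthly lags, each training row would cover the two years before the target month, so a series needs more than 24 observations to yield any training rows.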
# Univariate time-series fit using 24 monthly lags
forecaster = knn(freq="1mo", lags=24)
forecaster.fit(y=y_train)
y_pred = forecaster.predict(fh=test_size)
y_pred.head()
commodity_type | time | price |
---|---|---|
str | datetime[μs] | f64 |
"Soybean meal" | 2020-10-01 00:00:00 | 384.690002 |
"Soybean meal" | 2020-11-01 00:00:00 | 393.839996 |
"Soybean meal" | 2020-12-01 00:00:00 | 388.085999 |
"Soybean meal" | 2021-01-01 00:00:00 | 374.208008 |
"Soybean meal" | 2021-02-01 00:00:00 | 370.649994 |
We'll also provide a short description of the dataset to aid the LLM in its analysis.
dataset_context = "This dataset comprises forecasted commodity prices between 2020 and 2023."
Analyze Forecasts
Let's take a look at aluminum and European banana prices. You can select one or more entities (time series) to analyze through the basket parameter.
analysis = y_pred.llm.analyze(
context=dataset_context,
basket=["Aluminum", "Banana, Europe"]
)
display(Markdown(analysis))
- The Aluminum price shows a downward trend from October 2020 to March 2021, with a decrease of 5.9%. However, it starts to recover from April 2021 and shows a slight upward trend until March 2023, reaching a 0.3% increase compared to October 2020.
- The Banana price in Europe exhibits a relatively stable trend from October 2020 to July 2021, with a slight decrease of 7.9%. From August 2021 to March 2023, it shows a consistent upward trend, with an overall increase of 31.5% compared to October 2020.
- Both Aluminum and Banana prices have seasonality patterns, with prices fluctuating within a certain range throughout the years.
- An anomaly can be observed in the Aluminum price in February 2021, where it experiences a significant drop of 6.8% compared to the previous month. This anomaly could be attributed to specific market conditions or external factors impacting the commodity's demand and supply.
- Another anomaly occurs in the Banana price in Europe in November and December 2021, where it remains constant at 0.902 and 0.944, respectively. This sudden stability in price may indicate an unusual market behavior or an external influence affecting the commodity's availability or pricing.
- The difference in magnitude between the Aluminum and Banana prices is substantial, with Aluminum being approximately 1825 times more expensive than Banana in Europe.
- The Banana price in Europe shows a higher degree of volatility compared to the Aluminum price, as reflected in the wider range of price fluctuations.
- The Banana price experiences a more significant increase from October 2020 to March 2023 compared to the Aluminum price, indicating potentially higher demand or supply constraints for Bananas in Europe during this period.
- The overall trend for both commodities suggests a positive outlook for Banana prices in Europe, while the Aluminum market shows a more mixed and fluctuating pattern.
- These trends and anomalies should be considered when assessing the potential profitability and risks associated with investing in the Aluminum and Banana markets between 2020 and 2023.
Compare Forecasts
Let's now compare the previous selection with a new one. We'll refer to these as baskets A and B.
basket_a = ["Aluminum", "Banana, Europe"]
basket_b = ["Chicken", "Cocoa"]
Now compare!
comparison = y_pred.llm.compare(
basket=basket_a,
other_basket=basket_b
)
display(Markdown(comparison))
The provided time series data consists of two dataframes: "This" and "Other". Let's compare and contrast these dataframes in terms of trend, seasonality, and anomalies.
Trend:
- Aluminum in the "This" dataframe shows a slight downward trend over time, with a decrease of 12.4% from October 2020 to March 2023.
- Chicken in the "Other" dataframe does not exhibit a clear trend, fluctuating within a relatively narrow range over the given time period.
Seasonality:
- Aluminum in the "This" dataframe does not show any noticeable seasonality pattern.
- Chicken in the "Other" dataframe also does not exhibit a distinct seasonality pattern.
Anomalies:
- Aluminum in the "This" dataframe experienced a significant drop of 6.4% from January 2021 to February 2021, followed by a slight recovery.
- Banana in the "This" dataframe shows a relatively stable value over time, with no significant anomalies observed.
- Chicken in the "Other" dataframe does not display any notable anomalies.
- Cocoa in the "Other" dataframe experienced a sudden increase of 8.7% from February 2022 to March 2022.
In summary, while Aluminum in the "This" dataframe exhibits a downward trend and a notable anomaly, Chicken in the "Other" dataframe shows no clear trend or seasonality with no significant anomalies.
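The two outputs above do not fully agree: the analysis reports the February 2021 Aluminum drop as 6.8%, while the comparison reports 6.4% for the same month. One way to settle such a figure is to compute period-over-period changes directly. The sketch below is plain Python over made-up values; `largest_step_change` is a hypothetical helper, not part of functime.

```python
def largest_step_change(times, prices):
    """Return (time, pct_change) for the largest absolute period-over-period move."""
    if len(prices) < 2 or len(times) != len(prices):
        raise ValueError("need at least two observations and matching lengths")
    best_time, best_pct = None, 0.0
    for prev, curr, t in zip(prices, prices[1:], times[1:]):
        pct = (curr - prev) / prev * 100.0
        if best_time is None or abs(pct) > abs(best_pct):
            best_time, best_pct = t, pct
    return best_time, best_pct


# Toy series with made-up prices
when, pct = largest_step_change(
    ["2021-01", "2021-02", "2021-03"],
    [100.0, 90.0, 94.5],
)
```

Applied to one entity's forecast, sorted by time, this returns the month with the largest move and its size, which can be compared with the anomaly the LLM describes.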