Forecasting Parameters
This section describes how TimeCopilot
determines the three core forecasting parameters: freq
, h
, and seasonality
when you call
TimeCopilot.forecast(...)
.
You can:
- let the assistant infer everything automatically (recommended for a quick start),
- provide the values in plain-English inside the
query
, or - override them explicitly with keyword arguments.
What do these terms mean?
freq
: the pandas frequency string that describes the spacing of your timestamps ("H"
for hourly,"D"
for daily,"MS"
for monthly-start, etc.). It tells the models how many observations occur in one unit of seasonality.seasonality
: the length of the dominant seasonal cycle expressed in number offreq
periods (24 for hourly data with a daily cycle, 12 for monthly‐start data with a yearly cycle, …). Seeget_seasonality
for the default mapping.h
(horizon): how many future periods you want to forecast.
Pandas available frequencies
You can see here the complete list of available frequencies.
With these concepts in mind, let's see how TimeCopilot
chooses their values.
Where do the parameters come from?
TimeCopilot
follows these precedence rules:
- Natural-language query wins. If the query text mentions any of the three parameters they are extracted by an LLM agent and used first.
- Explicit keyword arguments are next.
Any argument you pass directly to
parse()
(freq=
,h=
,seasonality=
) fills the gaps left by the query. - Automatic inference is the fallback.
If a value is still unknown it is inferred from the data frame:
freq
:maybe_infer_freq(df)
seasonality
:get_seasonality(freq)
h
:2 * seasonality
Summary
Text -> kwargs -> automatic inference (from your data, df
)
Passing parameters in a natural-language query
Sometimes it's easier to embed the settings directly in your query:
import pandas as pd
from timecopilot.agent import TimeCopilot
df = pd.read_csv("https://timecopilot.s3.amazonaws.com/public/data/air_passengers.csv")
query = """
Which months will have peak passenger traffic in the next 24 months?
use 12 as seasonality
"""
tc = TimeCopilot(llm="gpt-4o")
# Passing `None` simply uses the defaults; they are shown
# here for clarity but can be omitted.
result = tc.forecast(
df=df,
query=query,
freq=None, # default, infer from query/df
h=None, # default, infer from query/df
seasonality=None, # default, infer from query/df
)
print(result.output.user_query_response)
# Based on the forecast, peak passenger traffic
# in the next 24 months is expected to occur in the months of July and August
# both in 1961 and 1962.
How does the inference happen?
Under the hood the LLM receives a system prompt like:
"Extract the following fields if they appear in the user text…"
…and returns a JSON tool call that is validated against the
DatasetParams
schema.
Supplying the parameters programmatically (skip the LLM)
If you already know the values you can skip the LLM entirely:
result = tc.forecast(
df=df,
freq="MS", # monthly-start
h=12, # one year ahead
seasonality=12, # yearly
query=None, # no natural-language query
)
Because every field is supplied, no inference or LLM call happens.
Mixed approach (query + kwargs)
You can combine both techniques. The parser fills the missing fields from the kwargs or, if still empty, infers them:
query = "Which months will have peak passenger traffic in the next 24 months?"
result = tc.forecast(
df=df,
freq="MS", # explicit override
h=None, # default, pulled from query (24)
seasonality=None, # default, inferred as 12
query=query,
)
Choosing sensible defaults
When you let TimeCopilot
infer the parameters:
freq
should be either present in the query or directly deducible from yourds
column (regular timestamps with no gaps).seasonality
defaults to the conventional period for the frequency (e.g. 7 for daily, 12 for monthly). Override it if your data behaves differently.h
defaults to twice the seasonal period—large enough for meaningful evaluation while staying quick to compute.
Note
These defaults aim to keep the first run friction-free. Fine-tune them as soon as you have more insight into your particular dataset.