Time series forecasting with LLM-based foundation models and scalable AIOps on AWS

Time series forecasting is critical for decision-making across industries. From predicting traffic flow to sales forecasting, accurate predictions enable organizations to make informed decisions, mitigate risks, and allocate resources efficiently. However, traditional machine learning approaches often require extensive data-specific tuning and model customization, resulting in lengthy and resource-heavy development.

Enter Chronos, a cutting-edge family of time series models that uses the power of large language model (LLM) architectures to break through these hurdles. As a foundation model, Chronos is pre-trained on large and diverse datasets, enabling it to generalize forecasting capabilities across multiple domains. This innovative approach allows Chronos to excel at zero-shot forecasts—predictions made without specific training on the target dataset. Chronos outperforms task-specific models across most benchmarked datasets.

Chronos is founded on a key insight: both LLMs and time series forecasting aim to decode sequential patterns to predict future events. This parallel allows us to treat time series data as a language to be modeled by off-the-shelf transformer architectures. To make this possible, Chronos converts continuous time series data into a discrete vocabulary through a two-step process of scaling the time series by its absolute mean and then quantizing the scaled time series into a fixed number of equally spaced bins.
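The two-step tokenization described above can be illustrated with a minimal sketch. The bin count, the value range, and the function names here are illustrative assumptions chosen for readability. They are not the settings or the API of the Chronos library, which also reserves special tokens that this sketch omits.

```python
from typing import List, Tuple


def mean_scale(series: List[float]) -> Tuple[List[float], float]:
    """Step 1: divide every value by the mean absolute value of the series."""
    scale = sum(abs(x) for x in series) / len(series)
    if scale == 0:
        scale = 1.0  # all-zero series: leave the values unchanged
    return [x / scale for x in series], scale


def quantize(scaled: List[float], num_bins: int, low: float, high: float) -> List[int]:
    """Step 2: map each scaled value to the index of its equally spaced bin."""
    width = (high - low) / num_bins
    tokens = []
    for x in scaled:
        clipped = min(max(x, low), high)  # values outside [low, high] are clipped
        # The upper edge of the range belongs to the last bin.
        tokens.append(min(int((clipped - low) / width), num_bins - 1))
    return tokens


def dequantize(tokens: List[int], num_bins: int, low: float, high: float,
               scale: float) -> List[float]:
    """Map token ids back to bin centers and undo the scaling."""
    width = (high - low) / num_bins
    return [(low + (t + 0.5) * width) * scale for t in tokens]


# Hypothetical sales series, used only for illustration.
series = [10.0, 20.0, 30.0, 20.0]
scaled, scale = mean_scale(series)
tokens = quantize(scaled, num_bins=10, low=-5.0, high=5.0)
restored = dequantize(tokens, num_bins=10, low=-5.0, high=5.0, scale=scale)
```

Because each token stands for a whole bin, `restored` only approximates `series`. A finer vocabulary reduces this quantization error at the cost of more tokens for the model to learn.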

In this blog post, we guide you through integrating Chronos into an Amazon SageMaker pipeline, using a synthetic dataset that simulates a sales forecasting scenario, to produce accurate and efficient predictions with minimal data. You will learn how to use SageMaker features to orchestrate the entire workflow from fine-tuning to deployment. By the end, you will be equipped to streamline your development process and apply Chronos to any time series data.

Prerequisites

SageMaker domain access with required IAM permissions: You need to have access to a SageMaker domain with the necessary AWS Identity and Access Management (IAM) permissions to create and manage resources. Make sure that you have the required permissions to create notebooks, deploy models, and perform other tasks outlined in this post. See quick setup for Amazon SageMaker AI for instructions about setting up a SageMaker domain. To follow along, see the code in GitHub.

Overview of SageMaker Pipelines

We use SageMaker Pipelines to orchestrate training and evaluation experiments. With Amazon SageMaker Pipelines, you can:

Run multiple experiment iterations simultaneously, reducing overall processing time and cost
Monitor and visualize the performance of each experiment run with Studio integration
Invoke downstream workflows for further analysis, deployment, or model selection

Training pipeline

Generate data

The availability and quality of public time series data are limited compared to the extensive high-quality text datasets available in the natural language processing (NLP) domain. This disparity poses challenges for training models intended for zero-shot forecasting, which requires large-scale, diverse time series data. Given that we’re fine-tuning a pretrained Chronos model, we use only a small set of synthetically generated data.

To generate diverse time series patterns, the first step in our pipeline generates a synthetic dataset using a kernel bank of basis kernels. These kernels define fundamental time series patterns, including linear trends, smooth local variations, and seasonality. By combining these kernels through random binary operations, we create complex, synthetic time series data. This process allows us to generate intricate patterns from simple basis kernels.
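The kernel composition idea can be sketched with the standard library alone. This is a minimal illustration, assuming three basis kernels, arbitrary kernel parameters, and sampling from a zero-mean Gaussian process. All names below are hypothetical, and the `generate_data.py` script used in the pipeline may differ in its kernel bank and parameters.

```python
import math
import random
from typing import Callable, List

Kernel = Callable[[float, float], float]


def linear_kernel(x: float, y: float) -> float:
    """Basis kernel for linear trends."""
    return 1.0 + x * y


def rbf_kernel(x: float, y: float, length_scale: float = 0.2) -> float:
    """Basis kernel for smooth local variations."""
    return math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def periodic_kernel(x: float, y: float, period: float = 0.25) -> float:
    """Basis kernel for seasonality."""
    return math.exp(-2.0 * math.sin(math.pi * abs(x - y) / period) ** 2)


KERNEL_BANK: List[Kernel] = [linear_kernel, rbf_kernel, periodic_kernel]


def add(a: Kernel, b: Kernel) -> Kernel:
    return lambda x, y: a(x, y) + b(x, y)


def multiply(a: Kernel, b: Kernel) -> Kernel:
    return lambda x, y: a(x, y) * b(x, y)


def compose_kernel(rng: random.Random, max_ops: int = 3) -> Kernel:
    """Combine randomly chosen basis kernels with random + or * operations."""
    kernel = rng.choice(KERNEL_BANK)
    for _ in range(rng.randint(0, max_ops)):
        op = rng.choice([add, multiply])
        kernel = op(kernel, rng.choice(KERNEL_BANK))
    return kernel


def cholesky(matrix: List[List[float]]) -> List[List[float]]:
    """Lower-triangular factor L such that L times its transpose equals matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                # Guard against tiny negative values caused by rounding.
                lower[i][j] = math.sqrt(max(matrix[i][i] - s, 1e-12))
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_series(rng: random.Random, length: int = 48) -> List[float]:
    """Draw one series from a zero-mean Gaussian process with a composed kernel."""
    times = [i / length for i in range(length)]
    kernel = compose_kernel(rng)
    # Small diagonal jitter keeps the covariance numerically positive definite.
    cov = [[kernel(s, t) + (1e-6 if i == j else 0.0) for j, t in enumerate(times)]
           for i, s in enumerate(times)]
    lower = cholesky(cov)
    noise = [rng.gauss(0.0, 1.0) for _ in range(length)]
    return [sum(lower[i][k] * noise[k] for k in range(i + 1)) for i in range(length)]


series = sample_series(random.Random(0))
```

Sums and products of valid kernels are themselves valid kernels, which is why random binary operations over a small bank can produce a wide range of patterns. Calling `sample_series` with different seeds yields different kernel compositions and therefore different series shapes.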

This data processing job is accomplished using a PyTorchProcessor, which runs PyTorch code (generate_data.py) within a container managed by SageMaker. Data and other relevant artifacts for debugging are located in the default Amazon Simple Storage Service (Amazon S3) bucket associated with the SageMaker account. Logs for each step in the pipeline can be found in Amazon CloudWatch.

from sagemaker.pytorch.processing import PyTorchProcessor

base_job_name = f"{pipeline_name}/data-generation-step"

# Processor that runs generate_data.py in a SageMaker-managed PyTorch container.
script_processor = PyTorchProcessor(
    command=['python3'],
    role=role,
    instance_count=1,
    instance_type="ml.c5.2xlarge",
    base_job_name=base_job_name,
    sagemaker_session=pipeline_session,
    framework_version='1.13',
    py_version='py39'
)

Hyperparameter search

After data generation, we fine-tune a pretrained Chronos model. Fine-tuning allows the model to specialize in a specific use case that may not be well represented in its pretraining data. In this post, we use amazon/chronos-t5-small, but you can use any model that fits your use case. The following table shows the available models.

Model	Parameters	Based on
chronos-t5-tiny	8M	t5-efficient-tiny
chronos-t5-mini	20M	t5-efficient-mini
chronos-t5-small	46M	t5-efficient-small
chronos-t5-base	200M	t5-efficient-base
chronos-t5-large	710M	t5-efficient-large

For optimal output, we use automatic model tuning to find the best version of a model through hyperparameter tuning. This step is integrated into SageMaker Pipelines and enables running multiple training jobs in parallel, employing various methods and predefined hyperparameter ranges. In our pipeline, we specifically tune the learning rate to optimize our model’s performance. With the hyperparameter tuning capability in SageMaker, we increase the likelihood that our model achieves optimal accuracy and generalization for the given task.

from sagemaker.pytorch import PyTorch
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

# Training job definition; the entry point script lives in the 'model' directory.
estimator = PyTorch(
    role=role,
    instance_type=pipeline_parameters['training_instance_type'],
    output_path=f"s3://{bucket_name}/{pipeline_name}/models/",
    instance_count=1,
    source_dir='model',
    image_uri=train_image_uri,
    entry_point=model_name + ".py",
    base_job_name=f"{pipeline_name}/training/job",
)

# Search range for the learning rate.
hyper_ranges = {
    'learning-rate': ContinuousParameter(1e-5, 1e-4),
}

# Extract the training loss from the job logs; the raw string keeps the regex escape intact.
objective_name = "logloss"
metric_definitions = [{"Name": objective_name, "Regex": r"'loss': ([0-9\.]+),"}]

tuner_log = HyperparameterTuner(
    estimator,
    objective_name,
    hyper_ranges,
    metric_definitions,
    max_jobs=pipeline_parameters['max_jobs'],
    max_parallel_jobs=pipeline_parameters['max_parallel_jobs'],
    objective_type="Minimize",
    base_tuning_job_name=f"{pipeline_name}/HPTuning/{model_name}",
    random_seed=10
)
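To see how the metric definition above picks the objective value out of the training logs, the following sketch applies the same regular expression to two log lines. The log lines are hypothetical examples. The actual format depends on what the training script prints.

```python
import re

# Same pattern as in the metric definition, written as a raw string.
LOSS_PATTERN = re.compile(r"'loss': ([0-9\.]+),")

# Hypothetical log lines, used only for illustration.
train_line = "{'loss': 3.4127, 'learning_rate': 5e-05, 'epoch': 0.42}"
eval_line = "{'eval_loss': 2.9, 'epoch': 1.0}"


def extract_loss(line: str):
    """Return the captured loss as a float, or None if the line does not match."""
    match = LOSS_PATTERN.search(line)
    return float(match.group(1)) if match else None


train_loss = extract_loss(train_line)
eval_loss = extract_loss(eval_line)
```

The leading quote in the pattern means that only a key named exactly `'loss'` matches, so a line that reports `'eval_loss'` is ignored. The trailing comma in the pattern also means that the loss must not be the last entry on the line.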

Amazon SageMaker Model Registry

The selected model is then uploaded to SageMaker Model Registry, which plays a critical role in managing models that are ready for production. It stores models, organizes model versions, captures essential metadata and artifacts such as container images, and governs the approval status of each model. By using the registry, we can efficiently deploy models to accessible SageMaker environments and establish a foundation for model versioning.

from sagemaker.workflow.model_step import ModelStep

registration_steps = {}

# Register the best model from tuning in the model package group.
register_args = best_model.register(
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=[instance_type],
    transform_instances=[instance_type],
    model_package_group_name=model_package_group_name,
    domain="MACHINE_LEARNING",
    description="Chronos",
    task="REGRESSION",
    framework="PYTORCH",
    image_uri=inference_image_uri
)
# Store the registration step under the model name.
registration_steps[model_name] = ModelStep(
    name=model_name,
    step_args=register_args
)

Inference

Upon completion of our training pipeline, our model is then deployed using SageMaker hosting services, which enables the creation of an inference endpoint for real-time predictions. This endpoint allows seamless integration with applications and systems, providing on-demand access to the model’s predictive capabilities through a secure HTTPS interface. Real-time predictions can be used in scenarios such as stock price and energy demand forecasts.

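import json
import time

from sagemaker.deserializers import JSONDeserializer
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer

# Timestamped endpoint name so that repeated deployments do not collide.
endpoint_name = "chronos-endpoint-" + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
print(f"EndpointName: {endpoint_name}")
model.deploy(
    initial_instance_count=1,
    instance_type="ml.p3.2xlarge",
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
    endpoint_name=endpoint_name
)

predictor = Predictor(endpoint_name=endpoint_name)

payload = {"inputs": input_data}
jstr = json.dumps(payload)

# The payload is already a JSON string, so the content type is set explicitly.
p = predictor.predict(
    jstr,
    initial_args={
        "ContentType": 'application/json'
    }
)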

Sample prediction output

The following figure demonstrates a sample forecast from the Chronos endpoint.

Chronos benchmark performance

The preceding graph shows the performance of various time series forecasting models on 27 datasets that were not used to train the Chronos models. The benchmark compares the zero-shot performance of Chronos models against local statistical models, task-specific models, and other pretrained models. Two aspects are evaluated: probabilistic forecasting, measured by weighted quantile loss (WQL), and point forecasting, measured by mean absolute scaled error (MASE). Both metrics are normalized using a Seasonal Naive baseline, and the results are aggregated using geometric means. Note that some of the other pretrained models in the comparison had prior exposure to the benchmark datasets.

Zero-shot results are from Chronos: Learning the Language of Time Series.
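For readers who want to reproduce this style of evaluation on their own data, the following is a minimal sketch of the two metrics and of the aggregation step. The function names and the sample values are illustrative assumptions. This is not the benchmark's evaluation code.

```python
import math
from typing import Dict, Sequence


def mase(actual: Sequence[float], forecast: Sequence[float],
         history: Sequence[float], season_length: int) -> float:
    """Mean absolute error scaled by the in-sample seasonal naive error."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_errors = [abs(history[t] - history[t - season_length])
                    for t in range(season_length, len(history))]
    return mae / (sum(naive_errors) / len(naive_errors))


def quantile_loss(actual: float, predicted: float, level: float) -> float:
    """Pinball loss of one quantile prediction at the given level."""
    if actual > predicted:
        return level * (actual - predicted)
    return (1.0 - level) * (predicted - actual)


def wql(actual: Sequence[float],
        quantile_forecasts: Dict[float, Sequence[float]]) -> float:
    """Weighted quantile loss, averaged over the supplied quantile levels."""
    total_abs = sum(abs(a) for a in actual)
    per_level = [
        2.0 * sum(quantile_loss(a, q, level) for a, q in zip(actual, preds)) / total_abs
        for level, preds in quantile_forecasts.items()
    ]
    return sum(per_level) / len(per_level)


def aggregate_relative_score(model_scores: Sequence[float],
                             baseline_scores: Sequence[float]) -> float:
    """Geometric mean of per-dataset scores divided by the baseline's scores."""
    logs = [math.log(m / b) for m, b in zip(model_scores, baseline_scores)]
    return math.exp(sum(logs) / len(logs))


# Illustrative values only.
mase_value = mase(actual=[5.0, 6.0], forecast=[4.0, 8.0],
                  history=[1.0, 2.0, 3.0, 4.0], season_length=1)
wql_value = wql(actual=[10.0, 20.0], quantile_forecasts={0.5: [8.0, 24.0]})
relative = aggregate_relative_score([0.5, 2.0], [1.0, 1.0])
```

An aggregated relative score below 1 means the model beats the baseline on average across datasets. The geometric mean is used so that a dataset where the ratio halves and another where it doubles cancel out.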

Conclusion

In this blog post, we’ve demonstrated how to use Amazon SageMaker AIOps features to deploy Chronos, a powerful time series forecasting model based on LLM architectures. By using SageMaker Pipelines, we’ve showcased a comprehensive approach to building, training, and deploying sophisticated forecasting models at scale. This implementation offers efficiency in model development, scalability, streamlined AIOps, real-time inference capabilities, and cost-effectiveness. The integration of Chronos with SageMaker opens up new possibilities for businesses across various sectors to implement advanced time series forecasting without extensive in-house machine learning expertise. As AI and machine learning continue to evolve, solutions like Chronos on Amazon SageMaker represent a significant step forward in making sophisticated forecasting techniques more accessible and actionable, potentially leading to more informed decision-making and improved operational efficiency across industries.

About the Authors

Alston Chan is a Software Development Engineer at Amazon Ads. He builds machine learning pipelines and recommendation systems for product recommendations on the Detail Page. Outside of work, he enjoys game development and rock climbing.

Maria Masood specializes in building data pipelines and data visualizations at AWS Commerce Platform. She has expertise in Machine Learning, covering natural language processing, computer vision, and time-series analysis. A sustainability enthusiast at heart, Maria enjoys gardening and playing with her dog during her downtime.

Nick Biso is a Machine Learning Engineer at AWS Professional Services. He solves complex organizational and technical challenges using data science and engineering. In addition, he builds and deploys AI/ML models on the AWS Cloud. His passion extends to his proclivity for travel and diverse cultural experiences.
