Airflow XCom Exclusive

from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

# Delete XCom rows older than 30 days (the interval syntax below is PostgreSQL)
purge_old_xcoms = SQLExecuteQueryOperator(
    task_id='purge_old_xcoms',
    conn_id='airflow_db',  # Points to your backend metadata db
    sql="""
        DELETE FROM xcom
        WHERE timestamp < NOW() - INTERVAL '30 days';
    """,
)

5. XCom Troubleshooting Guide

Pull an upstream task's return value using task_instance.xcom_pull(task_ids='upstream_task', key='return_value').

The metadata DB stores only small JSON pointers; actual data lives in S3 with an automatic 24-hour TTL. Debugging becomes linear: each task’s inputs are fully determined by its explicit upstream keys.
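The pointer pattern described above can be sketched without any Airflow dependency. This is a minimal illustration under stated assumptions: FAKE_BUCKET is a hypothetical in-memory stand-in for an S3 bucket, and the POINTER_PREFIX marker and both function names are invented for this example. In a real deployment, logic like this would typically live in a custom XCom backend (a subclass of BaseXCom overriding serialize_value and deserialize_value), and the 24-hour TTL would be enforced on the bucket side, which is not modelled here.

```python
import json
import uuid

# Hypothetical in-memory stand-in for an S3 bucket; a real backend
# would call an object-store client here instead.
FAKE_BUCKET: dict[str, bytes] = {}

# Assumed marker for this sketch, not an Airflow convention.
POINTER_PREFIX = "xcom-pointer://"


def store_as_pointer(value) -> str:
    """Write the payload to the object store and return a small JSON pointer."""
    object_key = f"xcom/{uuid.uuid4()}.json"
    FAKE_BUCKET[object_key] = json.dumps(value).encode("utf-8")
    # Only this short string would be written to the metadata DB.
    return json.dumps({"pointer": POINTER_PREFIX + object_key})


def resolve_pointer(stored: str):
    """Read the pointer held in the metadata DB and fetch the real payload."""
    pointer = json.loads(stored)["pointer"]
    object_key = pointer[len(POINTER_PREFIX):]
    return json.loads(FAKE_BUCKET[object_key].decode("utf-8"))
```

With this shape, a downstream task that calls resolve_pointer on the stored string gets the original payload back, while the database row stays a few dozen bytes regardless of how large the payload is.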

While powerful, XComs are not a magic bullet for data transfer. They have strict limitations, largely because they are stored in the Airflow Metadata Database (e.g., MySQL, PostgreSQL).

By understanding both the power and the boundaries of XCom, you can design data pipelines that are not only correct and maintainable but also performant at any scale. Use XCom for what it does best: passing small pieces of metadata between tasks. Leave the heavy lifting to the dedicated systems that Airflow orchestrates so well.

from airflow.decorators import task

@task
def extract():
    # The return value is automatically stored as an XCom
    return {"data": [1, 2, 3]}

Mastering Airflow XCom: The Exclusive Guide to Cross-Communication

Downstream tasks pull data using xcom_pull.

Explicitly set do_xcom_push=False on heavy CLI or shell tasks.

XComs consist of a key, value, and timestamp, along with attributes for the specific Task Instance and DAG Run.

The TaskFlow API allows you to pass data between tasks automatically, making the code much cleaner and removing the need for manual xcom_pull commands.


Keep default XCom payloads under a few kilobytes.
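One way to enforce that size guideline is a small guard called just before a task returns its value. The sketch below is illustrative only: the 4 KB budget is an assumed reading of "a few kilobytes", the helper names are hypothetical, and the JSON-encoded length only approximates what the metadata database would actually store.

```python
import json

# Assumed budget for this sketch; tune it to your own metadata DB.
MAX_XCOM_BYTES = 4 * 1024


def xcom_size_bytes(value) -> int:
    """Approximate the stored size by measuring the JSON-encoded payload."""
    return len(json.dumps(value).encode("utf-8"))


def check_xcom_size(value, limit: int = MAX_XCOM_BYTES):
    """Return the value unchanged, or raise if it exceeds the size budget."""
    size = xcom_size_bytes(value)
    if size > limit:
        raise ValueError(
            f"XCom payload is {size} bytes, over the {limit}-byte budget; "
            "store it externally and pass a reference instead"
        )
    return value
```

A task could end with return check_xcom_size(result), so an oversized payload fails loudly in that task instead of silently bloating the database.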
