Airflow XCom

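If Task A is supposed to push a value that Task B needs, but Task A fails before pushing it, there is nothing for Task B to read. Under the default trigger rule Task B will not run at all in that situation, but if it does run (for example, with a trigger rule such as all_done), its pull returns None. Always handle the case where XCom returns None (check if data is None) before using the value.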
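A minimal sketch of that guard, assuming a classic Python callable that receives the task instance as ti. The helper name pull_required and the upstream task id "task_a" are hypothetical; ti.xcom_pull(task_ids=..., key=...) is the standard task-instance call.

```python
def pull_required(ti, task_id, key="return_value"):
    """Pull an XCom value and fail loudly if the upstream task pushed nothing."""
    data = ti.xcom_pull(task_ids=task_id, key=key)
    if data is None:
        # Raising here fails the task with a clear message instead of
        # letting None propagate into later processing.
        raise ValueError(f"No XCom found for task_id={task_id!r}, key={key!r}")
    return data


def task_b_callable(ti):
    # "task_a" is a hypothetical upstream task id.
    metadata = pull_required(ti, "task_a")
    return metadata["processed_records"]
```

Raising an exception is one choice among several; depending on the pipeline, falling back to a default value or skipping the task may be more appropriate.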

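Do not pass raw pandas DataFrames, massive CSV logs, or heavy binary media files through standard database-backed XComs. XCom values are stored as rows in the Airflow metadata database, so large payloads cause severe database bloat and can exhaust memory when they are serialized and loaded.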
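One way to enforce this rule is a size check before a value is returned or pushed. The sketch below is an illustration only: the helper name ensure_xcom_sized and the 48 KiB threshold are hypothetical choices, not Airflow settings, and it assumes the value is JSON-serializable.

```python
import json

MAX_XCOM_BYTES = 48 * 1024  # hypothetical threshold; tune it to your metadata database


def ensure_xcom_sized(value, limit=MAX_XCOM_BYTES):
    """Return value unchanged if its JSON form fits under limit, else raise."""
    size = len(json.dumps(value).encode("utf-8"))
    if size > limit:
        raise ValueError(
            f"XCom payload is {size} bytes (limit {limit}); "
            "store it externally and pass a reference instead"
        )
    return value
```

A task would wrap its return value, for example return ensure_xcom_sized(summary), so that an oversized payload fails early rather than landing in the database.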

    from datetime import datetime

    from airflow.decorators import dag, task


    @dag(start_date=datetime(2026, 1, 1), schedule=None, catchup=False)
    def taskflow_xcom_pipeline():
        @task
        def generate_data():
            # Automatically pushed to XCom under the key "return_value"
            return {"status": "success", "processed_records": 1450}

        @task
        def process_data(input_metadata: dict):
            # Automatically pulls the XCom value from generate_data
            print(f"Processing execution: {input_metadata['processed_records']} records.")

        # Setting up implicit dependencies and data flow
        data_summary = generate_data()
        process_data(data_summary)


    taskflow_xcom_pipeline()

Option B: Traditional Operators (Explicit Push/Pull)
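For the traditional-operator style (Option B), a minimal sketch of an explicit push and pull might look like the following. The DAG id, task ids, and the key "run_metadata" are illustrative names. The import path airflow.operators.python is the Airflow 2 location and may differ in other versions; the guarded import is only there so the two callables can be exercised without Airflow installed.

```python
from datetime import datetime

try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:  # lets the callables below run without Airflow installed
    DAG = None


def push_metadata(ti):
    # Explicit push under a custom key ("run_metadata" is an illustrative name).
    ti.xcom_push(key="run_metadata", value={"status": "success", "processed_records": 1450})


def pull_metadata(ti):
    # Explicit pull: name both the upstream task and the key.
    metadata = ti.xcom_pull(task_ids="push_metadata", key="run_metadata")
    if metadata is None:
        raise ValueError("push_metadata did not publish run_metadata")
    print(f"Processing {metadata['processed_records']} records.")
    return metadata["processed_records"]


if DAG is not None:
    with DAG(
        dag_id="classic_xcom_pipeline",
        start_date=datetime(2026, 1, 1),
        schedule=None,
        catchup=False,
    ):
        push = PythonOperator(task_id="push_metadata", python_callable=push_metadata)
        pull = PythonOperator(task_id="pull_metadata", python_callable=pull_metadata)
        # Unlike TaskFlow, the dependency must be declared explicitly.
        push >> pull
```

Compared with the TaskFlow version, nothing is inferred here: the key, the upstream task id, and the task ordering are all spelled out by hand.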

[Task A] --(push dataframe)--> [Custom XCom Backend] --(upload payload)--> [AWS S3 / GCS]
                                         |
                                 (write URI only)
                                         v
                            [Airflow Metadata Database]
                                         |
                                 (read URI only)
                                         v
[Task B] <--(pull dataframe)-- [Custom XCom Backend] <--(download payload)-- [AWS S3 / GCS]

Implementing a Custom S3 XCom Backend
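The flow in the diagram can be sketched without Airflow or cloud credentials. In a real deployment this logic would typically live in a subclass of airflow.models.xcom.BaseXCom that overrides serialize_value and deserialize_value and is enabled through the xcom_backend setting. In the sketch below everything else is an assumption for illustration: InMemoryStore stands in for S3 or GCS, the bucket name is made up, and JSON stands in for whatever format (for example Parquet) a DataFrame would actually use.

```python
import json
import uuid

URI_PREFIX = "s3://example-xcom-bucket/"  # hypothetical bucket name


class InMemoryStore:
    """Stand-in for S3/GCS so the sketch runs without cloud credentials."""

    def __init__(self):
        self._objects = {}

    def upload(self, uri, payload: bytes):
        self._objects[uri] = payload

    def download(self, uri) -> bytes:
        return self._objects[uri]


def serialize_value(value, store):
    """Upload the payload and return only its URI, the part the database keeps."""
    uri = f"{URI_PREFIX}{uuid.uuid4()}.json"
    store.upload(uri, json.dumps(value).encode("utf-8"))
    return uri


def deserialize_value(stored, store):
    """Resolve a stored URI back into the original payload."""
    if isinstance(stored, str) and stored.startswith(URI_PREFIX):
        return json.loads(store.download(stored).decode("utf-8"))
    return stored  # values that were stored inline pass through unchanged


store = InMemoryStore()
uri = serialize_value({"rows": [1, 2, 3]}, store)   # Task A side: push
restored = deserialize_value(uri, store)            # Task B side: pull
```

The prefix check on the read side is what lets small inline values and offloaded payloads coexist: only strings that look like a backend URI trigger a download.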

Standard database-backed XCom records are not deleted automatically. Set up a periodic maintenance DAG that runs a clean-up query against the xcom database table to purge entries older than your history retention policy (e.g., 30 days).
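The shape of such a clean-up query can be sketched against a stand-in table. This is an assumption-heavy illustration: it uses an in-memory SQLite table with only a few columns, whereas the real xcom table lives in the Airflow metadata database and has more columns. The function name purge_old_xcoms is hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # example retention policy


def purge_old_xcoms(conn, now, retention_days=RETENTION_DAYS):
    """Delete xcom rows older than the retention window; return rows removed."""
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    cursor = conn.execute("DELETE FROM xcom WHERE timestamp < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


# Stand-in table for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xcom (dag_id TEXT, task_id TEXT, value TEXT, timestamp TEXT)")
now = datetime(2026, 3, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO xcom VALUES (?, ?, ?, ?)",
    [
        ("demo", "task_a", "old", (now - timedelta(days=45)).isoformat()),
        ("demo", "task_a", "recent", (now - timedelta(days=5)).isoformat()),
    ],
)
removed = purge_old_xcoms(conn, now)
remaining = [row[0] for row in conn.execute("SELECT value FROM xcom")]
```

Recent Airflow releases also ship an airflow db clean command that can target the xcom table, which may be preferable to hand-written SQL where it is available.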