# Product updates

> New features, fixes, and improvements for the Runpod platform.

<AccordionGroup>
  <Accordion title="July 2026" defaultOpen>
    **July 1, 2026**

    <h4><Badge color="green">New Release</Badge> Deploy Pods with private AWS ECR images (beta)</h4>

    A new tutorial covers how to pull container images from private AWS ECR repositories into Runpod Pods using cross-account IAM delegation. It walks through configuring ECR repository policies, adding ECR credentials in the Runpod console, and deploying a Pod with a private image, without managing credentials directly. [Read the tutorial](/tutorials/pods/use-private-ecr-images)
  </Accordion>

  <Accordion title="June 2026">
    <h4><Badge color="red">Breaking</Badge> Lifecycle operations are now CLI-only</h4>

    Flash SDK methods for endpoint and app lifecycle operations (deploy, undeploy, update, and creating or deleting apps and environments) now raise a `FlashUsageError` that points to the equivalent `flash` command. Run these operations through the [Flash CLI](/flash/cli/overview) instead, which keeps the build and manifest pipeline and local state tracking consistent.

    <h4><Badge color="green">New Release</Badge> High-Performance Network Volumes now available</h4>

    You can now attach high-performance network volumes to [Pods, Serverless endpoints, and Instant Clusters](/storage/network-volumes) for significantly faster model load times. Look for the purple diamond icon to identify compatible datacenters.

    <h4><Badge color="green">New Release</Badge> Deploy When Available</h4>

    You can now request a GPU that's currently out of capacity and get notified by email when it becomes available. Runpod saves your Pod configuration so you can deploy immediately when capacity returns.

    <h4><Badge color="blue">Improvement</Badge> Hub navigation consolidated</h4>

    Hub navigation items are now consolidated into a single unified entry, making it easier to find templates and repos.

    <h4><Badge color="red">Bug Fix</Badge> Billing records now show correct data for deleted resources</h4>

    SKU, region, and creation timestamps now appear correctly in [billing views and exports](/pods/pricing) for deleted Pods and network volumes.
  </Accordion>

  <Accordion title="May 2026">
    <h4><Badge color="green">New Release</Badge> Async Jobs for Serverless</h4>

    You can now submit a job to a [Serverless endpoint](/serverless/overview) and retrieve the result asynchronously when capacity is available. Jobs queue and process automatically when a worker is free, with no always-on workers or polling loops required.

    <h4><Badge color="green">New Release</Badge> Serverless Worker Fitness Checks</h4>

    Serverless workers now run automated health checks before accepting jobs. Runpod automatically removes unhealthy workers from rotation, reducing failed requests and improving endpoint reliability.

    <h4><Badge color="green">New Release</Badge> 24GB MiG instances now available</h4>

    You can now partition H100 and RTX PRO 6000 GPUs into up to seven independent [24GB MiG instances](/references/gpu-types), giving you more granular, lower-cost access without reserving a full card.

    <h4><Badge color="green">New Release</Badge> Cost Centers now generally available</h4>

    [Cost Centers](/get-started/manage-accounts) let teams allocate and track GPU spend by project, team, or business unit. Detailed cost breakdowns are now available in billing, and all users receive itemized invoices as of May 1.

    <h4><Badge color="blue">Improvement</Badge> New Pod deploy flow with workload-first GPU selection</h4>

    The Pod deployment experience has been redesigned. Instead of picking a GPU first, you now choose a template or workload type and see GPUs ranked as recommended, compatible, or incompatible. The new flow includes Save as Template, AI-assisted GPU selection, and a Notify Me When Available option for out-of-capacity cards.
  </Accordion>

  <Accordion title="April 2026">
    <h4><Badge color="green">New Release</Badge> Flash is now generally available</h4>

    [Flash](/flash/overview) is now generally available. You can run Python functions on cloud GPUs with a single `@Endpoint` decorator, with no containers or infrastructure setup required. Workers scale automatically, dependencies install on remote workers, and you can deploy production APIs with `flash deploy`.

    <h4><Badge color="green">New Release</Badge> Instant Cluster Expansion and Priority FlashBoot now live</h4>

    [Instant Clusters](/instant-clusters/overview) can now expand to more nodes faster. Priority FlashBoot reduces cold-start times for cluster workers. Both features are live with no configuration changes needed. Expanding an existing cluster is currently only available to Runpod admins. To add nodes to an existing cluster, reach out to the Runpod team.

    <h4><Badge color="green">New Release</Badge> FlashBoot for CPU Serverless now in public beta</h4>

    CPU Serverless workers now support FlashBoot, dramatically reducing cold-start times for your CPU endpoints. GA is planned for later this quarter.

    <h4><Badge color="blue">Improvement</Badge> GPU price reductions across popular SKUs</h4>

    GPU prices have been reduced across a range of SKUs, lowering the cost of your training and inference workloads. Updated pricing is reflected in the console and [pricing page](/pods/pricing).

    <h4><Badge color="red">Bug Fix</Badge> Serverless GPU exclusions now correctly respected</h4>

    GPU type exclusions set on Serverless endpoints were not being enforced, causing workloads to land on excluded GPU types and resulting in incorrect billing. The issue is now fixed, and new alerting has been added to detect recurrence.
  </Accordion>
</AccordionGroup>

<Note>
  We've updated our release notes format for easier navigation. Updates from April 2026 onwards are listed above. Browse earlier releases by year and month in the archive below.
</Note>

<Tabs>
  <Tab title="2026">
    <AccordionGroup>
      <Accordion title="March 2026">
        ## Flash beta: Run Python functions on cloud GPUs

        [Flash](/flash/overview) is now in public beta. Flash is a Python SDK that lets you run functions on Runpod Serverless GPUs with a single decorator:

        ```python theme={"theme":{"light":"github-light","dark":"github-dark"}}
        import asyncio

        from runpod_flash import Endpoint, GpuType

        @Endpoint(
            name="hello-gpu",
            gpu=GpuType.NVIDIA_GEFORCE_RTX_4090,
            dependencies=["torch"]
        )
        async def hello():  # This function runs on Runpod
            import torch
            gpu_name = torch.cuda.get_device_name(0)
            print(f"Hello from your GPU! ({gpu_name})")
            return {"gpu": gpu_name}

        asyncio.run(hello())
        print("Done!")  # This runs locally
        ```

        **Key features:**

        * **Remote execution**: Mark functions with `@Endpoint` to run on GPUs/CPUs automatically.
        * **Auto-scaling**: Workers scale from 0 to N based on demand.
        * **Dependency management**: Packages install automatically on remote workers.
        * **Two patterns**: Queue-based endpoints for batch work, load-balanced endpoints for REST APIs.
        * **Flash apps**: Build production-ready APIs with `flash init`, `flash dev`, and `flash deploy`.

        **Get started:**

        <CardGroup cols={2}>
          <Card title="Overview" href="/flash/overview" icon="book">
            Learn more about Flash.
          </Card>

          <Card title="Quickstart" href="/flash/quickstart" icon="bolt">
            Run your first GPU workload in 5 minutes.
          </Card>

          <Card title="Create endpoints" href="/flash/create-endpoints" icon="code">
            Learn queue-based and load-balanced patterns.
          </Card>

          <Card title="Flash CLI" href="/flash/cli/overview" icon="terminal">
            Development and deployment commands.
          </Card>
        </CardGroup>

        ## Flash: Multi-datacenter deployments

        Flash now supports deploying endpoints to [multiple datacenters](/flash/configuration/parameters#datacenter) simultaneously. Pass a list of datacenters to distribute your workload across regions for improved availability and reduced latency. You can also attach [network volumes per datacenter](/flash/configuration/storage#multi-datacenter-volumes) for region-specific data access.
      </Accordion>

      <Accordion title="February 2026">
        ## New Public Endpoints and expanded examples

        **[New Public Endpoints](/public-endpoints/reference):** Expansion of available models across all categories.

        * **Video:** [SORA 2](/public-endpoints/models/sora-2) and [SORA 2 Pro](/public-endpoints/models/sora-2-pro), [Kling v2.1](/public-endpoints/models/kling-v2-1) and [v2.6 Motion Control](/public-endpoints/models/kling-v2-6-motion-control), [WAN 2.6](/public-endpoints/models/wan-2-6-t2v).
        * **Image:** [Seedream 4.0](/public-endpoints/models/seedream-4-t2i).
        * **Text:** [Qwen3 32B](/public-endpoints/models/qwen3-32b), [IBM Granite 4.0](/public-endpoints/models/granite-4).
        * **Audio:** [Chatterbox Turbo](/public-endpoints/models/chatterbox-turbo) for text-to-speech.

        **New integrations and guides:**

        * [Vercel AI SDK integration](/public-endpoints/ai-sdk): New `@runpod/ai-sdk-provider` package for TypeScript projects with streaming, text generation, and image generation support.
        * [AI coding tools guide](/public-endpoints/ai-coding-tools): Configure OpenCode, Cursor, and Cline to use Runpod Public Endpoints as your model provider.

        **[New tutorials](/tutorials/introduction/overview):**

        * [Build a text-to-video pipeline](/tutorials/public-endpoints/text-to-video-pipeline): Chain multiple Public Endpoints to generate videos from text prompts.
        * [Deploy cached models](/tutorials/serverless/model-caching-text): Reduce cold start times with model caching.
        * [Integrate Serverless with web applications](/tutorials/serverless/generate-sdxl-turbo): Build a complete image generation app.
        * [Build a chatbot with Gemma 3](/tutorials/serverless/run-gemma-7b): Deploy vLLM with OpenAI API compatibility.
        * [Run Ollama on Pods](/tutorials/pods/run-ollama): Set up Ollama for LLM inference.
        * [Build Docker images with Bazel](/tutorials/pods/build-docker-images): Containerize your applications.
      </Accordion>

      <Accordion title="January 2026">
        ## GitHub release rollback GA and load balancing Serverless repos in beta

        * [GitHub release rollback](/serverless/workers/github-integration#roll-back-to-a-previous-build): Roll back your Serverless endpoint to any previous build from the console. Restore an earlier version when you encounter issues without waiting for a new GitHub release.
        * [Load balancing Serverless repos (beta)](/hub/publishing-guide): Load balancing endpoints are now available in the Hub. Publish or convert any listing to load balancer type by setting `"endpointType": "LB"` in your `hub.json` file, then deploy as a Serverless endpoint or Pod from the Hub page. Maintain a single listing for your model and let users choose their deployment method: autoscaling Serverless or dedicated Pod resources.
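
        The `hub.json` change described above can be scripted. The following is a minimal sketch, assuming `hub.json` is a plain JSON object; only the `"endpointType": "LB"` key and value come from this release note. The helper name `set_load_balancer_type` and the `title` field in the usage example are hypothetical.

        ```python
        import json
        import tempfile
        from pathlib import Path


        def set_load_balancer_type(hub_json_path: Path) -> dict:
            """Set "endpointType": "LB" in a hub.json file and return the updated config."""
            config = json.loads(hub_json_path.read_text(encoding="utf-8"))
            config["endpointType"] = "LB"  # value named in the release note
            hub_json_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
            return config


        if __name__ == "__main__":
            # Demonstrate on a throwaway file; "title" is a placeholder field, not a documented one.
            with tempfile.TemporaryDirectory() as tmp:
                path = Path(tmp) / "hub.json"
                path.write_text(json.dumps({"title": "my-model"}), encoding="utf-8")
                print(set_load_balancer_type(path))
        ```

        All other keys in the file are left untouched, so the same listing can be converted without rewriting the rest of its metadata.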
      </Accordion>
    </AccordionGroup>
  </Tab>

  <Tab title="2025">
    <AccordionGroup>
      <Accordion title="December 2025">
        ## Pod migration in beta and Serverless development guides

        * [Pod migration (beta)](/references/troubleshooting/pod-migration): Migrate your Pod to a new machine when your stopped Pod's GPU is occupied. Provisions a new Pod with the same specifications and automatically transfers your data to an available machine.
        * [New Serverless development guides](/serverless/overview): We've added a comprehensive new set of guides for developing, testing, and debugging Serverless endpoints.
      </Accordion>

      <Accordion title="September 2025">
        ## Slurm Clusters GA, cached models in beta, and new Public Endpoints available

        * [Slurm Clusters are now generally available](/instant-clusters/slurm-clusters): Deploy production-ready HPC clusters in seconds. These clusters support multi-node performance for distributed training and large-scale simulations with pay-as-you-go billing and no idle costs.
        * [Cached models are now in beta](/serverless/endpoints/model-caching): Eliminate model download times when starting workers. The system places cached models on host machines before workers start, prioritizing hosts with your model already available for instant startup.
        * [New Public Endpoints available](/public-endpoints/overview): [WAN 2.5](/public-endpoints/models/wan-2-5) combines image and audio to create lifelike videos, while [Nano Banana](/public-endpoints/models/nano-banana-edit) merges multiple images for composite creations.
      </Accordion>

      <Accordion title="August 2025">
        ## Hub revenue sharing launches and Pods UI gets refreshed

        * [Hub revenue share model](/hub/revenue-sharing): Publish to the Runpod Hub and earn credits when others deploy your repo. Earn up to 7% of compute revenue through monthly tiers with credits auto-deposited into your account.
        * [Pods UI updated](/pods/overview): Refreshed modern interface for interacting with Runpod Pods.
      </Accordion>

      <Accordion title="July 2025">
        ## Public Endpoints arrive, Slurm Clusters in beta

        * [Public Endpoints](/public-endpoints/overview): Access state-of-the-art AI models through simple API calls with an integrated playground. Available endpoints include [Qwen Image Edit](/public-endpoints/models/qwen-image-edit), [Flux Kontext](/public-endpoints/models/flux-kontext-dev), [Cogito 671B](/public-endpoints/models/cogito-671b), and [Minimax Speech](/public-endpoints/models/minimax-speech).
        * [Slurm Clusters (beta)](/instant-clusters/slurm-clusters): Create on-demand multi-node clusters instantly with full Slurm scheduling support.
      </Accordion>

      <Accordion title="June 2025">
        ## S3-compatible storage and updated referral program

        * [S3-compatible API for network volumes](/storage/s3-api): Upload and retrieve files from your network volumes without compute using AWS S3 CLI or Boto3. Integrate Runpod storage into any AI pipeline with zero-config ease and object-level control.
        * [Referral program revamp](/references/referrals): Updated rewards and tiers with clearer dashboards to track performance.
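
        As a rough illustration of the Boto3 path mentioned above, the sketch below builds the client settings and lists the files on a volume. Several details are assumptions rather than facts from this changelog: the endpoint URL pattern `https://s3api-<datacenter>.runpod.io/`, the use of the datacenter ID as the region, and the use of the network volume ID as the bucket name. Check the S3 API page for the actual values. `boto3` is a third-party package and is only imported when a request is made.

        ```python
        def s3_client_kwargs(datacenter_id: str, access_key: str, secret_key: str) -> dict:
            """Build keyword arguments for boto3.client() for one datacenter."""
            return {
                "service_name": "s3",
                "region_name": datacenter_id,  # assumption: region is the datacenter ID
                # assumption: per-datacenter endpoint URL pattern
                "endpoint_url": f"https://s3api-{datacenter_id.lower()}.runpod.io/",
                "aws_access_key_id": access_key,
                "aws_secret_access_key": secret_key,
            }


        def list_volume_files(volume_id: str, datacenter_id: str, access_key: str, secret_key: str) -> list:
            """List object keys on a network volume (assumes the volume ID is the bucket name)."""
            import boto3  # third-party; imported lazily so the helper above has no dependencies

            client = boto3.client(**s3_client_kwargs(datacenter_id, access_key, secret_key))
            response = client.list_objects_v2(Bucket=volume_id)
            return [obj["Key"] for obj in response.get("Contents", [])]


        if __name__ == "__main__":
            # Placeholder credentials; no network call is made here.
            print(s3_client_kwargs("EUR-IS-1", "ACCESS_KEY", "SECRET_KEY")["endpoint_url"])
        ```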
      </Accordion>

      <Accordion title="May 2025">
        ## Port labeling, price drops, Runpod Hub, and Tetra beta test

        * [Port labeling](/pods/overview): Name exposed ports in the UI and API to help team members identify services like Jupyter or TensorBoard.
        * [Price drops](/pods/pricing): Additional price reductions on popular GPU SKUs to lower training and inference costs.
        * [Runpod Hub](/hub/overview): A curated catalog of one-click endpoints and templates for deploying community projects without starting from scratch.
        * **Tetra beta test**: A Python library for running code on GPU with Runpod. Add a `@remote()` decorator to functions that need GPU power while the rest of your code runs locally.
      </Accordion>

      <Accordion title="April 2025">
        ## GitHub login, RTX 5090s, and global networking expansion

        * **Login with GitHub**: OAuth sign-in and linking for faster onboarding and repo-driven workflows.
        * **RTX 5090s on Runpod**: High-performance RTX 5090 availability for cost-efficient training and inference.
        * [Global networking expansion](/pods/networking): Rollout to additional data centers approaching full global coverage.
      </Accordion>

      <Accordion title="March 2025">
        ## Enterprise features arrive, REST API goes GA, Instant Clusters in beta, and APAC expansion

        * [CPU Pods get network storage access](/storage/network-volumes): GA support for network volumes on CPU Pods for persistent, shareable storage.
        * **SOC 2 Type I certification**: Independent attestation of security controls for enterprise readiness.
        * [REST API release](/api-reference/overview): REST API GA with broad resource coverage for full infrastructure-as-code workflows.
        * [Instant Clusters](/instant-clusters): Spin up multi-node GPU clusters in minutes with private interconnect and per-second billing.
        * **Bare metal**: Reserve dedicated GPU servers for maximum control, performance, and long-term savings.
        * **AP-JP-1**: New Fukushima region for low-latency APAC access and in-country data residency.
      </Accordion>

      <Accordion title="February 2025">
        ## REST API enters beta with full-time community manager

        * [REST API beta test](/api-reference/overview): RESTful endpoints for Pods, endpoints, and volumes for simpler automation than GraphQL.
        * **Full-time community manager hire**: Dedicated programs, content, and faster community response.
        * [Serverless GitHub integration release](/serverless/workers/github-integration): GA for GitHub-based Serverless deploys with production-ready stability.
      </Accordion>

      <Accordion title="January 2025">
        ## New silicon and LLM-focused Serverless upgrades

        * **CPU Pods v2**: Docker runtime parity with GPU Pods for faster starts, plus network volume support.
        * [H200s on Runpod](/references/gpu-types): NVIDIA H200 GPUs available for larger models and higher memory bandwidth.
        * [Serverless upgrades](/serverless/overview): Higher GPU counts per worker, new quick-deploy runtimes, and simpler model selection.
      </Accordion>
    </AccordionGroup>
  </Tab>

  <Tab title="2024">
    <AccordionGroup>
      <Accordion title="November 2024">
        ## Global networking expands and GitHub deploys enter beta

        * [Global networking expansion](/pods/networking): Added to CA-MTL-3, US-GA-1, US-GA-2, and US-KS-2 for expanded private mesh coverage.
        * [Serverless GitHub integration beta test](/serverless/workers/github-integration): Deploy endpoints directly from GitHub repos with automatic builds.
        * **Scoped API keys**: Least-privilege tokens with fine-grained scopes and expirations for safer automation.
        * **Passkey auth**: Passwordless WebAuthn sign-in for phishing-resistant account access.
      </Accordion>

      <Accordion title="August 2024">
        ## Storage expansion and private cross-data-center connectivity

        * [US-GA-2 added to network storage](/storage/network-volumes): Enable network volumes in US-GA-2.
        * [Global networking](/pods/networking): Private cross-data-center networking with internal DNS for secure service-to-service traffic.
      </Accordion>

      <Accordion title="July 2024">
        ## Storage coverage grows with major price cuts and revamped referrals

        * **US-TX-3 and EUR-IS-1 added to network storage**: Network volumes available in more regions for local persistence.
        * **Runpod slashes GPU prices**: Broad GPU price reductions to lower training and inference total cost of ownership.
        * [Referral program revamp](/references/referrals): Updated commissions and bonuses with an affiliate tier and improved tracking.
      </Accordion>

      <Accordion title="May 2024">
        ## \$20M seed round, community event, and broader Serverless options

        * **\$20M seed by Intel Capital and Dell Technologies Capital**: Funds infrastructure expansion and product acceleration.
        * **First in-person hackathon**: Community projects, workshops, and real-world feedback.
        * [Serverless CPU Pods](/references/cpu-types): Scale-to-zero CPU endpoints for services that don't need a GPU.
        * [AMD GPUs](/references/gpu-types): AMD ROCm-compatible GPU SKUs as cost and performance alternatives to NVIDIA.
      </Accordion>

      <Accordion title="February 2024">
        ## CPU compute and first-class automation tooling

        * **CPU Pods**: CPU-only instances with the same networking and storage primitives for cheaper non-GPU stages.
        * [runpodctl](/runpodctl/overview): Official CLI for Pods, endpoints, and volumes to enable scripting and CI/CD workflows.
      </Accordion>

      <Accordion title="January 2024">
        ## Console navigation overhaul and documentation refresh

        * **New navigational changes to Runpod UI**: Consolidated menus, consistent action placement, and fewer clicks for common tasks.
        * **Docs revamp**: New information architecture, improved search, and more runnable examples and quickstarts.
        * **Zhen AMA**: Roadmap Q\&A and community feedback session.
      </Accordion>
    </AccordionGroup>
  </Tab>

  <Tab title="2023">
    <AccordionGroup>
      <Accordion title="December 2023">
        ## New regions and investment in community support

        * **US-OR-1**: Additional US region for lower latency and more capacity in the Pacific Northwest.
        * **CA-MTL-1**: New Canadian region to improve latency and meet in-country data needs.
        * **First community manager hire**: Dedicated community programs and faster feedback loops.
        * **Building out the support team**: Expanded coverage and expertise for complex issues.
      </Accordion>

      <Accordion title="October 2023">
        ## Faster template starts and better multi-region hygiene

        * **Serverless quick deploy**: One-click deploy of curated model templates with sensible defaults.
        * **EU domain for Serverless**: EU-specific domain briefly offered for data residency, superseded by other region controls.
        * **Data-center filter for Serverless**: Filter and manage endpoints by region for multi-region fleets.
      </Accordion>

      <Accordion title="September 2023">
        ## Self-service upgrades, clearer metrics, new pricing model, and cost visibility

        * **Self-service worker upgrade**: Rebuild and roll workers from the dashboard without support tickets.
        * **Edit template from endpoint page**: Inline edit and redeploy the underlying template directly from the endpoint view.
        * **Improved Serverless metrics page**: Refinements to charts and filters for quicker root-cause analysis.
        * [Flex and active workers](/serverless/pricing): Always-on "active" workers for baseline load with on-demand "flex" workers for bursts.
        * **Billing explorer**: Inspect costs by resource, region, and time to identify optimization opportunities.
      </Accordion>

      <Accordion title="August 2023">
        ## Team governance, storage expansion, and better debugging

        * [Teams](/get-started/manage-accounts): Organization workspaces with role-based access control for Pods, endpoints, and billing.
        * [Savings plans](/pods/pricing): Plans surfaced prominently in console with easier purchase and management for steady usage.
        * **Network storage to US-KS-1**: Enable network volumes in US-KS-1 for local, persistent data workflows.
        * [Serverless log view](/serverless/development/logs): Stream worker stdout and stderr in the UI and API for real-time debugging.
        * **Serverless health endpoint**: Lightweight /health probe returning endpoint and worker status without creating a billable job.
        * **SOC 2 Type II compliant**: Security and compliance certification for enterprise customers.
      </Accordion>

      <Accordion title="June 2023">
        ## Observability, top-tier GPUs, and commitment-based savings

        * **Serverless metrics page**: Time-series charts for pXX latencies, queue delay, throughput, and worker states for faster debugging and tuning.
        * [H100s on Runpod](/references/gpu-types): NVIDIA H100 instances for higher throughput and larger model footprints.
        * [Savings plans](/pods/pricing): Commitment-based discounts for predictable workloads to lower effective hourly rates.
      </Accordion>

      <Accordion title="May 2023">
        ## Smoother auth and multi-region Serverless with persistent storage

        * **The new and improved Runpod login experience**: Streamlined sign-in and team access for faster, more consistent auth flows.
        * [Network volumes added to Serverless](/storage/network-volumes): Attach persistent storage to Serverless workers to retain models and artifacts across restarts and speed cold starts through caching.
        * **Serverless region support**: Pin or allow specific regions for endpoints to reduce latency and meet data-residency needs.
      </Accordion>

      <Accordion title="April 2023">
        ## Deeper autoscaling controls, richer metrics, persistent storage, and job cancellation

        * **Serverless scaling strategies**: Scale by queue delay and/or concurrency with min/max worker bounds to balance latency and cost.
        * **Queue delay**: Expose time-in-queue as a first-class metric to drive autoscaling and SLO monitoring.
        * **Request count**: Track success and failure totals over windows for quick health checks and alerting.
        * **runsync**: Synchronous invocation path that returns results in the same HTTP call for short-running jobs.
        * **Network storage beta**: Region-scoped, attachable volumes shareable across Pods and endpoints for model caches and datasets.
        * **Job cancel API**: Programmatically terminate queued or running jobs to free capacity and enforce client timeouts.
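
        To make the `runsync` and job cancel items concrete, here is a minimal sketch that builds (but does not send) the two HTTP requests using only the standard library. The base URL `https://api.runpod.ai/v2`, the `/runsync` and `/cancel/{job_id}` paths, and the bearer-token header reflect the Serverless API as commonly documented, not anything stated in this changelog, so treat them as assumptions. The helper names are hypothetical.

        ```python
        import json
        import urllib.request
        from typing import Optional

        API_BASE = "https://api.runpod.ai/v2"  # assumed base URL for Serverless endpoints


        def build_request(endpoint_id: str, api_key: str, path: str,
                          payload: Optional[dict] = None) -> urllib.request.Request:
            """Build an authenticated POST request for a Serverless endpoint without sending it."""
            return urllib.request.Request(
                url=f"{API_BASE}/{endpoint_id}/{path}",
                data=json.dumps(payload or {}).encode("utf-8"),
                method="POST",
                headers={
                    "Authorization": f"Bearer {api_key}",
                    "Content-Type": "application/json",
                },
            )


        def runsync_request(endpoint_id: str, api_key: str, job_input: dict) -> urllib.request.Request:
            """Synchronous invocation: the result comes back in the same HTTP response."""
            return build_request(endpoint_id, api_key, "runsync", {"input": job_input})


        def cancel_request(endpoint_id: str, api_key: str, job_id: str) -> urllib.request.Request:
            """Cancel a queued or running job by ID."""
            return build_request(endpoint_id, api_key, f"cancel/{job_id}")


        if __name__ == "__main__":
            req = runsync_request("ENDPOINT_ID", "API_KEY", {"prompt": "hello"})
            print(req.get_method(), req.full_url)
            # Sending is left to the caller, for example: urllib.request.urlopen(req, timeout=90)
        ```

        Keeping request construction separate from sending makes it straightforward to enforce a client-side timeout on `runsync` and then issue the cancel request if the timeout expires.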
      </Accordion>

      <Accordion title="April 1, 2023">
        ## Serverless platform hardens with cleaner API

        * **Serverless API v2**: Revised request and response schema with improved error semantics and new endpoints for better control over job lifecycle and observability.
      </Accordion>

      <Accordion title="February 1, 2023">
        ## Better control over notifications and GPU allocation

        * **Notification preferences**: Configure which platform events trigger alerts to reduce noise for teams and CI systems.
        * **GPU priorities**: Influence scheduling by marking workloads as higher priority to reduce queue time for critical jobs.
      </Accordion>
    </AccordionGroup>
  </Tab>

  <Tab title="2022">
    <AccordionGroup>
      <Accordion title="July 1, 2022">
        ## Encrypted volumes for persistent data

        * **Runpod now offers encrypted volumes**: Enable at-rest encryption for persistent volumes with no application changes required using platform-managed keys.
      </Accordion>
    </AccordionGroup>
  </Tab>
</Tabs>
