Catalogue Discovery Service

Welcome to the documentation for the Catalogue Discovery Service (CDS) – a modular, scalable, and intelligent discovery layer for Beckn-enabled networks. This guide is intended for users, adopters, integrators, and network facilitators looking to understand and adopt CDS in their ecosystem.


🌟 What is CDS?

The Catalogue Discovery Service (CDS) is a Beckn specifications-compliant, modular SaaS product designed to power intelligent discovery experiences across Beckn-enabled digital networks. It enables seekers — whether apps, platforms, assistants, or autonomous agents — to find and interact with relevant products, services, opportunities, and offerings from diverse providers.

Rather than being just a search engine, CDS is a semantically aware, agentic-AI-ready discovery layer that supports structured API search, URL-based search, and natural-language queries. It abstracts the complexity of decentralized, heterogeneous provider ecosystems and offers a seamless interface for searching across distributed catalogues. It is designed to work across sectors (mobility, commerce, health, logistics, and more) and use cases (B2C, B2B, public services).

CDS enables:

  • Natural language-driven discovery (via MCPs and NLP)

  • Browser-based lookup (via URL query specs)

  • Structured API-based search (Beckn-compliant /search)

  • Semantic expansion (leveraging schemas, tags, linked relationships)

  • Ranking, personalization, and trust filtering (coming soon)

CDS fulfills three core roles:

🗂️ Catalogue Store

At the heart of CDS lies a high-performance, indexed catalogue store. This is not just a passive data cache — it performs key functions critical to the lifecycle of distributed discovery.

CDS supports both push- and pull-based catalogue ingestion, allowing providers to publish their offerings in the way that suits them best. Key functions include:

  • Catalogue publishing: Providers can upload catalogues, push them via APIs, or expose them via syncable URLs or endpoints. CDS can poll or fetch these catalogues periodically (pull-based ingestion), enabling participation even from low-tech providers.

  • Validation: Incoming catalogues are checked for schema conformance, semantic integrity, and discoverability readiness.

  • Time-to-Live (TTL) and freshness tracking: Every entry in the store can be assigned a TTL, ensuring that seekers don’t encounter stale information.

  • Versioning and replacement: Catalogue entries are de-duplicated, versioned, and updated efficiently based on provider-specified policies.

This store enables sub-second response times for most common discovery queries — even when dealing with tens of thousands of items and hundreds of providers.
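The versioning and TTL behaviour described above can be sketched as a small in-memory store. This is only an illustration under stated assumptions: the class names, the `(provider_id, item_id)` key, and the integer `version` field are hypothetical and not taken from the CDS implementation.

```python
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CatalogueEntry:
    provider_id: str
    item_id: str
    version: int          # hypothetical monotonically increasing version
    payload: dict
    ingested_at: float    # Unix timestamp in seconds
    ttl_seconds: float


class CatalogueStore:
    """Toy in-memory store keyed by (provider_id, item_id)."""

    def __init__(self) -> None:
        self._entries = {}

    def upsert(self, entry: CatalogueEntry) -> bool:
        """Store the entry unless the same or a newer version already exists."""
        key = (entry.provider_id, entry.item_id)
        current = self._entries.get(key)
        if current is not None and current.version >= entry.version:
            return False  # duplicate or outdated update: keep the stored entry
        self._entries[key] = entry
        return True

    def fresh_entries(self, now: Optional[float] = None) -> List[CatalogueEntry]:
        """Return only the entries whose TTL has not yet expired."""
        now = time.time() if now is None else now
        return [e for e in self._entries.values()
                if now - e.ingested_at < e.ttl_seconds]
```

A real store would also persist and index entries; the sketch shows only how de-duplication, version replacement, and TTL filtering fit together.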

🔁 Gateway

The Gateway module acts as a smart fan-out mechanism that sends discovery requests to live Beckn-compatible provider systems — known as Beckn Provider Platforms (BPPs) — when real-time data is required. This is especially useful in cases where:

  • Availability is dynamic (e.g., ride availability, service slots)

  • Inventory changes frequently (e.g., in food delivery or quick commerce)

  • Pricing is contextual (e.g., surge pricing, offer eligibility)

The gateway ensures that live responses from BPPs can be collected, normalized, and blended with cached catalogue results, forming a unified, timely response for seekers. It supports timeout handling, partial response rendering, and resilience against delayed or failed responses.
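As a rough sketch of the fan-out pattern with timeout handling and partial responses, the snippet below queries several simulated providers concurrently and returns whatever arrives before a deadline. The function names are hypothetical, and `asyncio.sleep` stands in for the real Beckn call to a BPP.

```python
import asyncio


async def query_provider(name: str, delay_s: float) -> dict:
    """Stand-in for a live BPP call; the sleep simulates network latency."""
    await asyncio.sleep(delay_s)
    return {"provider": name, "items": [f"{name}-item"]}


async def fan_out(provider_delays: dict, timeout_s: float) -> dict:
    """Query every provider concurrently and keep whatever arrives in time."""
    tasks = {
        name: asyncio.create_task(query_provider(name, delay_s))
        for name, delay_s in provider_delays.items()
    }
    done, pending = await asyncio.wait(list(tasks.values()), timeout=timeout_s)
    for task in pending:
        task.cancel()  # stop waiting on slow providers
    responses = [task.result() for task in tasks.values()
                 if task in done and task.exception() is None]
    timed_out = [name for name, task in tasks.items() if task in pending]
    return {"responses": responses, "timed_out": timed_out}
```

Calling `asyncio.run(fan_out({"fast": 0.01, "slow": 1.0}, timeout_s=0.2))` returns the fast provider's response and reports the slow one as timed out, which is the partial-response case described above.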

🧠 Catalogue Search Engine

This is the intelligent heart of CDS. The search engine:

  • Interprets incoming queries (structured, unstructured, or hybrid)

  • Identifies which parts of the catalogue (or live APIs) are relevant

  • Matches seeker criteria with provider catalogue entries using both exact and semantic matches

  • Applies fusion strategies to combine cached and real-time data

  • Applies ranking and filtering rules to organize results meaningfully

The search engine is also extensible: scoring logic, relevance signals, personalization models, and domain-specific rules can be plugged in as modular components. Over time, it will support ML-based ranking, personalized scoring, and adaptive query refinement.


🌍 Why CDS Matters for Beckn Networks

In any Beckn-enabled digital network, the ability to discover relevant offerings — services, items, opportunities — quickly, intelligently, and reliably is fundamental. Whether it’s a commuter finding a nearby EV, a shopper browsing across small vendors in a city, or a citizen searching for clinics in their locality, discovery is the first and most critical touchpoint.

While the Beckn Protocol defines how APIs should work across seekers and providers, it doesn’t prescribe how seekers efficiently find providers or how provider inventories stay searchable, verifiable, or up to date. This is where the Catalogue Discovery Service (CDS) steps in — as a shared, intelligent discovery layer designed to make decentralized commerce actually usable, inclusive, and efficient at scale.

⚠️ The Problem Without CDS

In the absence of a shared discovery layer like CDS, ecosystems routinely face several technical and operational challenges:

  • Fragmented Catalogue Visibility: Providers may expose their catalogues in different formats, through varied interfaces, or sometimes not at all. There’s no common way for seekers to know what’s available or reliable.

  • Stale or Incomplete Results: Without TTLs or freshness strategies, seekers may receive outdated offerings or incomplete listings, leading to broken user experiences and low trust.

  • No Relevance or Ranking: Beckn-compliant results are structurally uniform but not prioritized. Without a shared ranking engine, seekers receive unfiltered, flat lists that don’t consider distance, trust, price, or availability.

  • Lack of Observability: Facilitators and public authorities cannot easily monitor how discovery is working across the network: who’s discoverable, how catalogues perform, or what users are searching for.

✅ How CDS Solves These Challenges

The Catalogue Discovery Service addresses these systemic gaps by providing a high-quality, extensible, and domain-neutral discovery infrastructure that sits between seekers and providers.

🔗 A Single Integration for Seekers

Seekers — whether apps, platforms, agents, or assistants — only need to integrate once with CDS. The service offers structured API queries, browser-based lookup, and natural language support — abstracting the complexity of fan-out logic, result fusion, provider variability, and ranking.

📡 Seamless Provider Onboarding

Providers can participate in discovery via multiple ingestion mechanisms: push APIs, upload portals, scheduled polling, or even smart scraping. This makes CDS inclusive of both highly digital and low-digital-capability providers.

⚡ Fast, Reliable, and Context-Aware Results

CDS combines cached catalogue responses and live provider APIs using a smart fusion engine. This allows seekers to get fast results when possible, but fall back to fresh data when required — with graceful degradation.

🎯 Built-in Ranking and Policy-Aware Filtering

CDS supports pluggable ranking logic — which can consider provider reputation, pricing, distance, availability, or ecosystem-specific rules. This turns flat data into meaningful discovery experiences.

📊 System-wide Observability

CDS dashboards, logs, and APIs help facilitators monitor discovery quality, provider freshness, ingestion success, and search demand — enabling data-driven ecosystem governance.

🧑‍🤝‍🧑 Who Needs CDS — and What They Gain

| Stakeholder | Why CDS Matters to Them | Key Benefits |
| --- | --- | --- |
| Seeker-facing Platforms | Require a seamless, highly responsive, smart way to discover relevant items across domains, providers, and locations. | 🔍 Faster, more relevant, and structured discovery across providers with just one API integration. |
| Providers | Need visibility in decentralized marketplaces, along with multiple flexible ways to publish, update, and manage catalogues. | 📢 Greater visibility, easier onboarding, and control over how their offerings appear and stay discoverable. |
| Network Facilitators | Must ensure consistent, high-quality discovery across all nodes in the network, spanning verticals and geographies. | 🧰 Out-of-the-box discovery layer that supports scalability, policy enforcement, and domain-specific curation. |
| Policymakers & Govts | Use CDS to support interoperable, inclusive public service discovery and monitor market health via open observability. | 📈 Transparent, inclusive, and standards-driven access to services for citizens with better analytics. |


🛠️ Key Features & Capabilities

The Catalogue Discovery Service (CDS) delivers a rich set of capabilities to support seamless discovery across decentralized networks. These capabilities are grouped by stakeholder experience — enabling seekers to discover meaningfully, providers to participate with minimal friction, and network operators to manage and observe the ecosystem.

🔍 Seeker-Facing Capabilities

1. Structured Discovery via Beckn Search API

Seekers (apps, platforms, assistants) can initiate discovery using a /search API defined by the Beckn Protocol. These structured queries can include filters like location, service category, item tags, price ranges, and more. CDS handles routing, fan-out to relevant providers, and collating responses into a unified, standards-compliant on_search payload.

2. Browser-Based Discovery

For lightweight seekers or citizen-facing portals, CDS supports query initiation through a standard browser link. This can be embedded in websites, QR codes, or search widgets — allowing easy discovery access without needing an app or API integration.

3. Natural Language Discovery (via MCP)

CDS supports natural language inputs — either by accepting them directly or integrating with a Model Capability Provider (MCP) for query translation. Seekers can send queries like “Find labs open after 7pm near MG Road,” which are resolved into structured Beckn-compatible search requests.

This can be implemented:

  • Client-side: via Beckn-ONIX adaptors or SDKs

  • Server-side: as part of CDS’s own processing logic (optional service)
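To make the translation step concrete, here is a deliberately tiny sketch that turns the example query into a structured filter set. A regular expression stands in for the MCP or NLP component, and the output keys (`category`, `open_after`, `location`) are illustrative assumptions rather than Beckn schema fields.

```python
import re

# Toy grammar: "find <category> open after <hour><am|pm> near <place>"
PATTERN = re.compile(
    r"find (?P<category>\w+) open after (?P<hour>\d{1,2})\s*(?P<ampm>am|pm)"
    r" near (?P<place>.+)",
    re.IGNORECASE,
)


def to_intent(query: str) -> dict:
    """Translate one narrow query shape into structured search filters."""
    match = PATTERN.fullmatch(query.strip().rstrip("."))
    if match is None:
        raise ValueError("query not understood by this toy parser")
    # Convert the 12-hour clock value to a 24-hour hour of day.
    hour = int(match["hour"]) % 12 + (12 if match["ampm"].lower() == "pm" else 0)
    return {
        "category": match["category"].lower(),
        "open_after": f"{hour:02d}:00",
        "location": match["place"],
    }
```

A production resolver would handle far more phrasing and map the result onto a Beckn-compatible search request; the point here is only the shape of the input and output.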

4. Smart Fusion Engine

The smart fusion engine determines whether to serve results from:

  • Cached provider catalogues (for speed and reliability)

  • Live provider APIs (for real-time freshness)

  • Or a hybrid of both

This decision is made dynamically based on query type, provider configuration, TTL settings, and domain logic.
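One way to picture that decision is as a small rule function. The rules below are simplified assumptions for illustration (the inputs, their names, and the precedence between them are hypothetical), not the actual CDS policy.

```python
from enum import Enum


class Source(Enum):
    CACHE = "cache"
    LIVE = "live"
    HYBRID = "hybrid"


def choose_source(needs_realtime: bool, cache_age_s: float, ttl_s: float,
                  provider_is_live: bool) -> Source:
    """Pick where to serve results from, under simplified illustrative rules."""
    cache_fresh = cache_age_s < ttl_s
    if not provider_is_live:
        return Source.CACHE    # no live endpoint: the cache is all there is
    if needs_realtime and cache_fresh:
        return Source.HYBRID   # quick cached results, refreshed with live data
    if needs_realtime or not cache_fresh:
        return Source.LIVE     # cache is stale or cannot answer the query
    return Source.CACHE
```

For example, a ride-availability query against a live provider with a fresh cache yields `Source.HYBRID`, while a generic listing query with a fresh cache stays on `Source.CACHE`.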

5. Result Aggregation and Ranking

Once responses are collected, CDS applies ranking and sorting logic to return results that are relevant, contextual, and useful to end users. Ranking can be based on:

  • Proximity

  • Availability

  • Provider rating

  • Network-specific scoring logic (e.g., affordability, eco-friendliness)

This avoids flat, unprioritized listings and improves usability dramatically.
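A weighted-sum score is one simple way to combine such signals. The sketch below assumes hypothetical item fields (`distance_km`, `available`, `rating` on a 0 to 5 scale) and operator-supplied weights; the actual CDS scoring logic may differ.

```python
def score(item: dict, weights: dict) -> float:
    """Weighted sum of normalized signals; higher is better."""
    proximity = 1.0 / (1.0 + item["distance_km"])   # closer items score higher
    availability = 1.0 if item["available"] else 0.0
    rating = item["rating"] / 5.0                    # assumes a 0-5 rating scale
    return (weights["proximity"] * proximity
            + weights["availability"] * availability
            + weights["rating"] * rating)


def rank(items: list, weights: dict) -> list:
    """Return items ordered from best to worst score."""
    return sorted(items, key=lambda item: score(item, weights), reverse=True)
```

Changing the weights per network (for instance, raising `proximity` for mobility) changes the ordering without touching the rest of the pipeline.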

📦 Provider-Facing Capabilities

1. Real-Time Discovery Participation

Providers with live Beckn Provider Platforms (BPPs) can receive discovery requests in real-time and respond with dynamic availability, pricing, or stock levels. CDS handles the request fan-out and response collation.

2. Push-Based Catalogue Ingestion

Providers can push catalogue data to CDS using secure ingestion APIs. Supported formats include JSON (aligned with Beckn item schema) or CSV. CDS validates incoming catalogues for structure, completeness, and semantic readiness before indexing.
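The parse-then-validate step might look roughly like the following. The required field set is a hypothetical minimum chosen for the example; real validation would check against the Beckn item schema.

```python
import csv
import io
import json

REQUIRED_FIELDS = ("id", "name", "price")  # hypothetical minimal field set


def parse_catalogue(raw: str, fmt: str) -> list:
    """Parse a pushed catalogue body (JSON array or CSV text) into item dicts."""
    if fmt == "json":
        return json.loads(raw)
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    raise ValueError(f"unsupported format: {fmt}")


def validate(items: list) -> tuple:
    """Split items into (valid, errors); each error names the row and fields."""
    valid, errors = [], []
    for index, item in enumerate(items):
        missing = [f for f in REQUIRED_FIELDS if item.get(f) in (None, "")]
        if missing:
            errors.append(f"row {index}: missing {', '.join(missing)}")
        else:
            valid.append(item)
    return valid, errors
```

Only the `valid` items would go on to indexing, while `errors` could be returned to the provider in the ingestion response.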

3. Upload Portal for Low-Digital Providers (Coming Soon)

CDS includes a user-friendly upload portal where providers — especially those with limited technical capabilities — can manually submit their offerings. Over time, this will include assisted ingestion from PDFs, scanned brochures, or even spoken input (AI-assisted ingestion).

4. Scheduled Catalogue Polling (Coming Soon)

For providers who host catalogues online or on semi-static endpoints, CDS can periodically poll and ingest updated information. This ensures catalogue freshness without requiring push capabilities from providers.

5. NLP/MCP-assisted Web Extraction (Coming Soon)

In cases where structured APIs are absent, CDS can use semantic crawling, schema.org tags, or NLP tools to extract catalogue data from provider websites or product feeds. This broadens the inclusion surface area for smaller or legacy providers.

6. TTL & Freshness Control (Coming Soon)

Providers can define how long their catalogue entries should remain valid (TTL). CDS respects these TTLs during fusion decisions and can notify providers when updates are needed — improving trust in discovery results.

7. Discovery Participation Preferences (Coming Soon)

Providers can indicate preferences — e.g., only participate in proximity searches, exclude price-based filters, or restrict discovery during certain hours. This gives them control without compromising discoverability.

🧑‍💼 Admin & Operator-Facing Capabilities

1. Discovery Traffic Dashboard (Coming Soon)

CDS exposes visual dashboards showing search query volumes, popular categories, peak times, and provider responsiveness. This helps network facilitators optimize performance and user experience.

2. Catalogue Ingestion Logs & Audits

Every ingestion event (push, poll, upload, crawl) is logged with metadata: timestamp, provider ID, catalogue size, errors (if any), and ingestion duration. These logs support transparency, compliance, and support workflows.

3. Alerting & Monitoring (Coming Soon)

CDS enables configuration of alerts — for example, if a provider’s TTL expires without update, or if real-time API response latency crosses a threshold. Alerts can be pushed to dashboards, Slack, or external monitoring systems.

4. Multi-Tenant & Domain Configuration

Network operators can run multiple verticals (e.g., mobility, retail, health) as separate tenants within the same CDS instance. Each tenant can have distinct configurations: ranking logic, branding, default filters, ingestion rules, and monitoring.
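Per-tenant configuration can be thought of as shared defaults plus tenant-specific overrides. The keys and tenant names below are illustrative assumptions, not the real CDS configuration format.

```python
# Hypothetical shared defaults applied to every tenant.
DEFAULTS = {"ranking": "proximity", "ttl_seconds": 3600, "branding": "default"}

# Hypothetical per-tenant overrides; unspecified keys fall back to DEFAULTS.
TENANT_OVERRIDES = {
    "mobility": {"ttl_seconds": 60},
    "retail": {"ranking": "price", "branding": "city-retail"},
}


def tenant_config(tenant: str) -> dict:
    """Return the effective settings for a tenant: defaults plus its overrides."""
    if tenant not in TENANT_OVERRIDES:
        raise KeyError(f"unknown tenant: {tenant}")
    return {**DEFAULTS, **TENANT_OVERRIDES[tenant]}
```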

5. Role-Based Access Control (RBAC)

Admin roles can be defined with granular permissions — e.g., ingestion managers, dashboard viewers, network policy editors — ensuring secure and auditable operations across distributed teams.
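A minimal sketch of such a permission check follows. The role names echo the examples above, but the permission strings and the mapping itself are hypothetical.

```python
# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "ingestion_manager": {"ingestion:read", "ingestion:write"},
    "dashboard_viewer": {"dashboard:read"},
    "network_policy_editor": {"policy:read", "policy:write", "dashboard:read"},
}


def is_allowed(roles: list, permission: str) -> bool:
    """A user is allowed if any of their roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```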


🧱 Design Principles

The Catalogue Discovery Service (CDS) has been designed with the express goal of serving as a robust, evolvable, and inclusive discovery layer for Beckn-based digital networks. Its design principles reflect the realities of building public infrastructure that must support scale, interoperability, and adaptability — while being accessible to a wide range of actors with varying digital maturity.

Each principle below informs how CDS is designed, developed, and deployed in real-world ecosystems.

🔄 Scalability & Performance

CDS is designed to operate under high query volumes and ingest large volumes of catalogues from thousands of providers across multiple domains. Whether a city-wide commerce network or a national health registry, CDS is intended to ensure low-latency responses and consistent throughput — making it suitable for mission-critical public services as well as high-frequency commercial applications.

🧩 Modularity & Extensibility

Every subsystem in CDS — from the fusion engine to ranking logic and ingestion flows — is designed to be modular. This means adopters can extend or replace specific components without touching the rest of the system. For example, a mobility network could introduce a proximity-weighted scoring plugin, while a health network could add compliance-based filtering — all without breaking the core discovery flow.

📐 Standards Compliance

CDS is designed to strictly adhere to the Beckn Protocol and also supports established open standards like OpenAPI and schema.org. This ensures seamless interoperability with Beckn registries, provider platforms (BPPs), and external tooling — reducing vendor lock-in and making it easier for developers and policymakers to collaborate across the stack.

⏱️ Data Freshness & TTL Control

To ensure seekers don’t see outdated information, CDS incorporates mechanisms for enforcing Time-To-Live (TTL) on every catalogue item. TTL policies are intended to be configurable by providers or network operators. CDS uses these policies to refresh stale data, trigger re-ingestion, or exclude expired entries from search results — preserving the relevance and trustworthiness of discovery.

🔎 Observability & Debuggability

CDS is designed to provide built-in dashboards, logs, and alerting systems that allow operators to understand what’s happening inside — from ingestion failures to slow provider responses to query volumes per domain. This observability is essential not just for debugging but also for governance, audits, and ecosystem analytics.

🏢 Multi-Tenancy & Domain Flexibility

CDS is architected to support multiple tenants — such as different domains (e.g., health, mobility, commerce), geographic zones (e.g., states or cities), or verticals (e.g., B2B vs. B2C) — within a single deployment. Each tenant can have its own branding, TTL rules, ingestion logic, and search ranking preferences. This makes CDS suitable for layered governance models or federated ecosystems.

🔐 Security & Access Control

All interfaces — API endpoints, dashboards, ingestion portals — are secured using token-based authentication and role-based access control (RBAC). Admins can assign scoped access to different roles (e.g., ingestion manager, support, analytics), ensuring only authorized users can modify configurations or access sensitive operational data.

🔧 Low-Code & Assisted Integration Paths

Recognizing that many providers — especially small or informal ones — may lack engineering capacity, CDS is designed to support multiple low-code and no-code ingestion paths. Providers can upload their catalogues via portals, while CDS handles validations and indexing. Future versions will include AI-powered ingestion from PDFs, spoken input, and unstructured web sources — further widening the inclusion net.

✅ Trust via Beckn Registries

CDS is integrated with Beckn registry services to ensure that only verified, authorized entities participate in discovery. Every incoming query or response can be validated against registry entries, enabling trust in decentralized transactions. This helps guard against spoofing, duplication, or misrepresentation — making the discovery layer safer and more compliant.


🏗️ High-Level Architecture

The architecture of the Catalogue Discovery Service (CDS) is designed to facilitate seamless, intelligent discovery across distributed Beckn networks. It connects seekers and providers through a shared orchestration and search layer that supports real-time as well as cached responses.

The system is organized into three primary zones: Seeker, CDS Core, and Provider. Each of these zones has clearly defined responsibilities and interfaces.

🔁 Discovery Flow at a Glance

When a discovery request is initiated (whether via structured API, browser link, or natural language query), it is first received by CDS’s Unified Search Interface. From there, the query is passed to the Search Orchestrator, which determines whether to serve the request from cached results, fan out to live providers, or do both.

If cache-based responses are sufficient (e.g., for generic product listings), the Cache Store is queried directly. If freshness is needed (e.g., ride availability, dynamic pricing), the orchestrator invokes the Fan-out Gateway, which reaches out to live providers via Beckn-compliant APIs.

A Smart Fusion Engine then combines the cached and live responses, deduplicates results, and applies scoring logic. The Ranking Engine orders the results based on configurable rules (e.g., proximity, price, rating), and the Response Builder formats the final payload before sending it back to the seeker.
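The merge-and-deduplicate step can be sketched as follows, assuming (hypothetically) that each result carries `provider_id` and `item_id` fields and that a live result should replace its cached duplicate.

```python
def fuse(cached: list, live: list) -> list:
    """Merge cached and live results; a live result replaces its cached duplicate."""
    merged = {}
    for source, results in (("cache", cached), ("live", live)):
        for result in results:
            key = (result["provider_id"], result["item_id"])
            # Live results are processed last, so they overwrite cached ones.
            merged[key] = {**result, "source": source}
    return list(merged.values())
```

The added `source` field is only there to make the outcome visible; scoring and ranking would then run over the fused list.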

🧭 Component Breakdown

📥 Seeker

  • API Query Interface: Accepts structured /search requests from seeker applications following the Beckn Protocol.

  • Browser-Based Query Interface: Supports initiating search via pre-defined URL query parameters for lightweight access.

  • Natural Language Query Interface: Accepts free-text queries and passes them for resolution (e.g., via MCP). Optional but powerful for agentic or consumer-centric platforms.

  • Unified Search Interface: Serves as the common entry point into CDS, abstracting over the type of incoming query.

🧠 CDS Core Components

  • Search Orchestrator: Coordinates the overall discovery workflow. It determines which downstream services to invoke based on the query, network policy, TTL freshness, and provider availability.

  • Cache Store: Stores indexed provider catalogues ingested via APIs, uploads, or polling. Supports fast lookups and is optimized for low-latency responses. TTL and versioning rules ensure data validity.

  • Fan-out Gateway: Handles outbound /search requests to Beckn Provider Platforms (BPPs). Manages timeouts, retries, and partial response scenarios to ensure robustness.

  • Smart Fusion Engine: Combines cache and real-time responses, deduplicates overlapping results, and normalizes formats where needed. Designed to evolve toward schema-aware and domain-sensitive fusion logic.

  • Ranking Engine: Applies scoring and prioritization to matched results. Ranking rules can consider factors like price, proximity, and popularity, and can be domain-specific or network-specific.

  • Response Builder: Formats and packages the final on_search response for the seeker. Ensures compliance with the Beckn Protocol, including any additional metadata, filters, or tags required.

📦 Provider Interfaces

  • Real-Time Beckn API: Exposes /search and /on_search endpoints for providers that support dynamic inventory or availability. Integrated directly with the Fan-out Gateway.

  • Push Ingestion API: Allows providers to upload catalogues (CSV, JSON) programmatically. Uploaded catalogues are validated, stored, and indexed in the Cache Store.

  • Polling Sync: For providers that expose catalogues passively, CDS can be configured to fetch updates periodically using scheduled sync jobs.

  • NLWeb/MCP Pull (Coming Soon): For semi-structured or unstructured sources (e.g., provider websites), CDS will support extracting catalogue data using semantic crawlers or ML/NLP tools.


🚦 What’s Coming Next?

The CDS team is actively working on several enhancements to improve discovery quality, inclusion, and intelligence across networks:

  • Configurable scoring plugins to support domain-specific result ranking

  • Public-facing metrics APIs for network analytics and observability

  • Enhanced natural language understanding and intent resolution (via MCP integration)

  • AI-assisted ingestion of catalogues from PDFs, images, or voice inputs

  • Improved dashboarding for provider activity and seeker search trends

  • Seamless support for federated multi-node deployments


📘 API Reference

The Catalogue Discovery Service (CDS) exposes a set of Beckn-compliant APIs that enable structured, browser-based, and natural language discovery across domains. These APIs follow the latest Beckn Protocol specifications and are designed to work seamlessly with any Beckn-compliant seeker or provider.

🔧 Available CDS APIs

POST /beckn/discover

Initiates a discovery query from the seeker using a structured Beckn search payload.

Request Schema:

Uses the SearchMessage structure defined in attributes.yaml, including:

  • context: identifying network ID, domain, BAP/BPP IDs, etc.

  • message.intent: describing the item or service being searched — including item.descriptor.name, fulfillment, payment, tags, and location details

Response Behavior:

This endpoint supports both synchronous and asynchronous discovery models:

  • If policies permit or the query can be resolved from cache, CDS responds immediately with a DiscoverResponse containing matching results.

  • If the request requires fan-out to live BPPs, CDS returns an AckResponse, and the actual results follow asynchronously via the /on_discover callback.
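The sketch below shows how a seeker might assemble a minimal request and tell the two response models apart. Only `context`, `context.message_id`, `message.intent`, `item.descriptor.name`, and `message.catalog` come from this page; the other field names and the acknowledgement shape (`message.ack.status == "ACK"`) are assumptions to be checked against the actual schema.

```python
import uuid


def build_discover_request(domain: str, bap_id: str, item_name: str) -> dict:
    """Assemble a minimal discovery payload (field names partly assumed)."""
    return {
        "context": {
            "domain": domain,
            "bap_id": bap_id,                   # assumed field name
            "message_id": str(uuid.uuid4()),    # used to correlate the callback
        },
        "message": {
            "intent": {"item": {"descriptor": {"name": item_name}}},
        },
    }


def classify_response(body: dict) -> str:
    """Return 'sync' if results are inline, 'async' if only an ACK came back."""
    message = body.get("message", {})
    if "catalog" in message:
        return "sync"
    if message.get("ack", {}).get("status") == "ACK":   # assumed ACK shape
        return "async"
    raise ValueError("unrecognized response shape")
```

A seeker that receives `"async"` would then wait for the matching callback instead of rendering results immediately.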

POST /beckn/on_discover

Receives the final discovery response from CDS, which may be a fusion of results from multiple sources.

Response Schema:

Follows the DiscoverResponse schema (defined in attributes.yaml) containing:

  • context: tracking the discovery flow

  • message.catalog: a list of matched providers, items, and fulfillment options

  • Optional metadata such as TTL status, match scores, and ranking rationale

Notes:

This endpoint is typically configured as a callback on the seeker (BAP) platform. The context.message_id of each callback should match that of the original /discover call.
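On the seeker side, correlating callbacks with outstanding requests could be handled along these lines. The class and method names are hypothetical; only the use of `context.message_id` as the correlation key comes from the note above.

```python
class PendingDiscoveries:
    """Tracks outstanding /discover calls so callbacks can be matched to them."""

    def __init__(self) -> None:
        self._pending = {}

    def register(self, message_id: str, query: str) -> None:
        """Remember a request that was acknowledged but not yet answered."""
        self._pending[message_id] = query

    def resolve(self, callback_body: dict):
        """Return the original query for this callback, or None if unknown."""
        message_id = callback_body.get("context", {}).get("message_id")
        return self._pending.pop(message_id, None)
```

Returning `None` for an unknown or already-resolved `message_id` lets the seeker ignore stray or duplicate callbacks.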

Browser-Based Discovery (HTTP GET)

Enables discovery via simple HTTP GET parameters, which is useful for embedding discovery flows in web portals, search widgets, or citizen-facing tools.

Query Parameters Include:

  • search_term or item_name

  • location (text or coordinates)

  • domain, category, tags, search_mode (e.g., realtime, cached, or hybrid)

Response Format:

Returns a DiscoverResponse in JSON format — either from cache or post-fusion — allowing lightweight or low-code access to the network’s offerings.
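Building such a link is a matter of URL-encoding the parameters listed above. In this sketch the base URL is a placeholder, since the actual endpoint path is not given here; the parameter names and `search_mode` values are the ones from the list.

```python
from urllib.parse import urlencode

# Placeholder only: the real host and path are not specified in this guide.
BASE_URL = "https://cds.example.org/discover"

SEARCH_MODES = {"realtime", "cached", "hybrid"}


def build_discovery_url(search_term: str, location: str,
                        search_mode: str = "hybrid", **filters: str) -> str:
    """Return a shareable discovery link with URL-encoded query parameters."""
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"unknown search_mode: {search_mode}")
    params = {"search_term": search_term, "location": location,
              "search_mode": search_mode, **filters}
    return f"{BASE_URL}?{urlencode(params)}"
```

The resulting string can be embedded in a web page, a QR code, or a search widget.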

📂 API Definitions and Schema References

CDS is built on top of the Beckn Protocol API spec and uses the core schema definitions from Beckn V2.

The DiscoverRequest and DiscoverResponse structures rely on foundational attributes like:

  • item.descriptor, availableAt

  • provider, fulfillment, payment, etc.

What makes Beckn V2 powerful, and CDS flexible, is its support for composable, use-case-specific extensions.

🔧 Composability in Beckn V2

CDS fully supports the inclusion of use-case-specific attributes via extensible fields such as:

  • itemAttribute, offerAttribute, orderAttribute, fulfillmentAttribute, etc.

These allow ecosystems to enrich discovery queries and responses with sectoral semantics. For example:

  • Health: Specialties like “pediatrician” or diagnostics like “lab with MRI”

  • Mobility: Environmental tags like “EV-only” or “wheelchair accessible”

  • Finance/Retail: Offers with eligibility conditions, cashback, or bundled services

CDS treats these custom attributes as opaque but indexable — meaning they can be stored, filtered, and even used in result ranking — without requiring structural changes to the platform.
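"Opaque but indexable" can be illustrated with a filter that compares attribute values without knowing what they mean. The item layout (an `itemAttribute` dict on each item) and the attribute keys are assumptions made for the example.

```python
def matches(item: dict, required: dict) -> bool:
    """True if every required custom attribute is present with the same value."""
    attributes = item.get("itemAttribute", {})
    return all(attributes.get(key) == value for key, value in required.items())


def filter_items(items: list, required: dict) -> list:
    """Keep only the items whose custom attributes satisfy the filter."""
    return [item for item in items if matches(item, required)]
```

The same function works for a health network filtering on a specialty and a mobility network filtering on an accessibility tag, because it never interprets the keys.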

This enables CDS to support highly tailored discovery experiences across domains such as retail, mobility, health, finance, and logistics.


🤝 Ready to Get Started?

The Catalogue Discovery Service (CDS) is available as a plug-and-play SaaS offering — purpose-built to help seekers, network facilitators, and ecosystem builders adopt Beckn with speed, confidence, and minimal technical overhead.

Whether you’re:

  • Building a new Beckn network and need a powerful discovery layer on day one,

  • Expanding an existing network with better catalogue management and intelligent search,

  • Or piloting a city, sector, or service line using Beckn…

CDS gives you a standards-compliant, production-ready solution without needing to build and maintain complex infrastructure yourself.

🔧 What’s Included in the CDS SaaS Offering?

  • Fully managed and hosted CDS instance

  • APIs for /discover, /on_discover, and browser-based discovery

  • Admin console for monitoring ingestion and network health

  • Optional MCP and dashboard extensions

  • Support for multi-tenant, multi-domain configurations

  • Ingestion onboarding assistance for providers

  • Beckn registry integration for trust and network verification

🚀 How to Get Started

We offer a simple onboarding flow for interested partners:

  1. Reach out to request a sandbox or pilot setup

  2. Share your use case(s) and discovery requirements (domain, location, TTL policies, etc.)

  3. Get started with a working discovery layer in days, not months

📬 Contact & Community

  • 📧 Email: [[email protected]]

  • 🌐 Website: [your-cds-saas-site.com]

  • 💬 Join the Community: [Slack / Forum / GitHub Discussions]

  • 🛠️ GitHub: [CDS Repository placeholder]

Whether you’re an app developer, a government agency, a digital public infrastructure architect, or a private-sector platform — CDS is your bridge to rapid, inclusive, and intelligent discovery across Beckn networks.

Let’s build interoperable ecosystems, one discovery at a time. ✨
