Uploading content

Once a domain is created, you can upload content for indexing. Select a domain from the sidebar, then click Add content. Guides Knowledge AI supports several content formats, each with its own ingestion pipeline.

DITA

Upload a .zip archive containing your DITA map and topic files. Guides Knowledge AI parses the DITA structure, resolves conrefs and key references, and indexes each topic as a separate document. Metadata such as shortdesc, prolog keywords, and navigation titles are preserved and used during retrieval.
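If your DITA source lives in a directory tree, the archive can be assembled with a short script before you upload it. The following is a minimal sketch, not part of Guides Knowledge AI: the function name `build_dita_archive` is hypothetical, and the set of file extensions treated as maps and topics (`.ditamap`, `.dita`, `.xml`) is an assumption you may need to adjust for your repository.

```python
import zipfile
from pathlib import Path


def build_dita_archive(source_dir: str, archive_path: str) -> list[str]:
    """Zip the DITA map and topic files under source_dir, keeping relative paths."""
    root = Path(source_dir)
    # Assumption: maps use .ditamap, topics use .dita or .xml.
    wanted = {".ditamap", ".dita", ".xml"}
    added: list[str] = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file() and path.suffix.lower() in wanted:
                # Store paths relative to the root so that href and conref
                # targets inside the map and topics keep pointing at the same files.
                arcname = path.relative_to(root).as_posix()
                archive.write(path, arcname)
                added.append(arcname)
    return added
```

Keeping the directory layout intact inside the archive matters because DITA maps and conrefs refer to topics by relative path.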

HTML

Provide one or more URLs pointing to public HTML pages. Guides Knowledge AI crawls each URL, extracts the main content (stripping navigation, headers, and footers), and indexes the resulting text. You can also provide a sitemap URL to index an entire site at once.
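To get a feel for what "extracting the main content" means, the sketch below drops text inside navigation, header, and footer elements and keeps the rest. It is an illustration only, using Python's standard `html.parser`. The class and function names are hypothetical, and the actual extraction logic in Guides Knowledge AI is not documented here and is likely more sophisticated.

```python
from html.parser import HTMLParser


class MainTextExtractor(HTMLParser):
    """Collect visible text while skipping page chrome such as nav, header, and footer."""

    # Assumption: these elements never contain content worth indexing.
    SKIPPED = {"nav", "header", "footer", "script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.skip_depth = 0          # > 0 while inside any skipped element
        self.chunks: list[str] = []  # text fragments kept for indexing

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Return the text outside skipped elements, one fragment per line."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Running a page through a filter like this before you submit its URL can help you predict which parts of the page are likely to end up in the index.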

PDF / Markdown agenting flow

For PDF and Markdown files, Guides Knowledge AI uses an agentic conversion pipeline. Documents are first parsed and converted into structured DITA using an LLM-driven agent that handles layout detection, table extraction, and section hierarchy inference. The resulting DITA is then indexed through the standard pipeline. You can monitor conversion progress and retry individual files in the Add content view.
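As a rough illustration of what section hierarchy inference produces for a Markdown file, the sketch below nests ATX headings (`#`, `##`, and so on) into a tree. This is a deterministic stand-in, not the LLM-driven agent described above. The `Section` class and `infer_hierarchy` function are hypothetical names, and the sketch ignores PDF input, Setext headings, and tables entirely.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    level: int
    children: list["Section"] = field(default_factory=list)


# One to six leading '#' characters, whitespace, then the heading text.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def infer_hierarchy(markdown: str) -> list[Section]:
    """Nest Markdown headings into a tree based on their heading level."""
    roots: list[Section] = []
    stack: list[Section] = []  # chain of currently open ancestor sections
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # do not treat '#' comments in code as headings
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if not match:
            continue
        section = Section(title=match.group(2), level=len(match.group(1)))
        # Close every open section at the same or a deeper level.
        while stack and stack[-1].level >= section.level:
            stack.pop()
        (stack[-1].children if stack else roots).append(section)
        stack.append(section)
    return roots
```

Each node in the resulting tree corresponds roughly to what would become a topic or nested section once the document is converted to DITA.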

FAQ

How do I upload content for indexing after creating a domain?

Select a domain from the sidebar, then click Add content. From there, choose the content format you want to ingest. Guides Knowledge AI supports multiple formats, each with its own ingestion pipeline.

How do I upload DITA content, and what gets indexed?

Upload a .zip archive that contains your DITA map and topic files. Guides Knowledge AI parses the DITA structure, resolves conrefs and key references, and indexes each topic as a separate document. It preserves metadata like shortdesc, prolog keywords, and navigation titles for retrieval.

How do I index HTML content from a website?

Provide one or more URLs that point to public HTML pages. Guides Knowledge AI crawls each URL, extracts the main content while stripping navigation, headers, and footers, and indexes the resulting text. You can also provide a sitemap URL to index an entire site at once.

How does the PDF/Markdown agenting flow work, and where can I monitor or retry conversions?

For PDF and Markdown files, Guides Knowledge AI uses an agentic conversion pipeline that first parses the documents and converts them into structured DITA using an LLM-driven agent. The agent handles layout detection, table extraction, and section hierarchy inference, and the resulting DITA is then indexed through the standard pipeline. You can monitor conversion progress and retry individual files in the Add content view.
