Welcome to PDFSage
PDFSage was founded on the premise that Adobe Acrobat Pro currently holds a monopoly on PDF editing, and therefore on the PDF market as a whole. For $10 per month, you get a mobile Acrobat Pro with a poor UI, on either iOS or Android but not both. After just 24 hours of building the PDF parsing, rendering, and editing engine, Bo Shang has parsing and editing almost fully working for PDF versions 1.4 and 1.7.
About PDFSage
PDFSage aims to channel ADHD and distraction into the PDF engine alone, dropping all other distractions, and to bring the PDFSage engine to all platforms as soon as possible to compete with Adobe's monopoly.
Technical Vision
PDFSage is developing a PDF engine that integrates specialized AI technologies for parsing, rendering, editing, and automation. Key capabilities include:
- High-precision PDF DOM parsing with malformed document recovery
- Structured rendering engine for real-time manipulations
- Seamless in-document editing powered by text detection and reflow
- Multi-threaded architecture for tasks like annotation, form processing, and data extraction
- Integration of AI for context-aware text editing
- Workflow automation for batch document processing
PDF Structure & Versions
The PDF file format comprises objects, cross-reference tables, trailers, and streams. Pages reference fonts, images, and other resources through these objects. Dictionaries store metadata, allowing flexible modification without rewriting the entire file.
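To make the role of the trailer concrete: a reader usually starts at the end of the file, where the `startxref` keyword gives the byte offset of the most recent cross-reference section. The sketch below is illustrative only. `find_startxref` is a hypothetical helper, not a PDFSage API, and it assumes the whole file is already in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper: returns the byte offset written after the last
// "startxref" keyword, or std::nullopt if the keyword or number is missing.
std::optional<std::size_t> find_startxref(std::string_view file) {
    constexpr std::string_view keyword = "startxref";
    const std::size_t pos = file.rfind(keyword);
    if (pos == std::string_view::npos) return std::nullopt;

    // Skip the end-of-line (and any other whitespace) after the keyword.
    std::size_t i = pos + keyword.size();
    while (i < file.size() &&
           (file[i] == ' ' || file[i] == '\t' || file[i] == '\r' || file[i] == '\n')) {
        ++i;
    }

    // Read the decimal offset of the cross-reference section.
    std::size_t offset = 0;
    bool any_digit = false;
    while (i < file.size() && file[i] >= '0' && file[i] <= '9') {
        offset = offset * 10 + static_cast<std::size_t>(file[i] - '0');
        any_digit = true;
        ++i;
    }
    if (!any_digit) return std::nullopt;
    return offset;
}
```

Searching backwards with `rfind` matters because a file that has been updated incrementally contains several `startxref` keywords, and the last one is the one a reader should follow.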
- PDF 1.3 – Added digital signatures, ICC-based and DeviceN color spaces, and JavaScript actions.
- PDF 1.4 – Added transparency and enhancements for interactive forms.
- PDF 1.5 – Introduced object streams and cross-reference streams (a compressed form of the cross-reference table).
- PDF 1.6 – Enhanced encryption and 3D support.
- PDF 1.7 – Standardized as ISO 32000-1, improved metadata.
- PDF 2.0 – ISO 32000-2, clarifications, security updates, and expanded annotations.
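Since feature support depends on the version, a parser typically reads the `%PDF-M.N` header first. The following is a minimal sketch under stated assumptions: `PdfVersion`, `parse_pdf_version`, and `at_least` are hypothetical names, and searching only the first 1024 bytes is a tolerance chosen for this example, not something the text above specifies.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical value type for the version found in the file header.
struct PdfVersion {
    int major_version;
    int minor_version;
};

// Looks for "%PDF-M.N" near the start of the file and returns M and N.
std::optional<PdfVersion> parse_pdf_version(std::string_view file) {
    constexpr std::string_view marker = "%PDF-";
    // Assumption: tolerate a little junk before the header.
    const std::size_t pos = file.substr(0, 1024).find(marker);
    if (pos == std::string_view::npos) return std::nullopt;

    const std::size_t i = pos + marker.size();
    if (i + 2 >= file.size()) return std::nullopt;

    const auto is_digit = [](char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    };
    if (!is_digit(file[i]) || file[i + 1] != '.' || !is_digit(file[i + 2])) {
        return std::nullopt;
    }
    return PdfVersion{file[i] - '0', file[i + 2] - '0'};
}

// True if `v` is the given version or newer, e.g. at_least(v, 1, 5)
// to decide whether object streams may appear.
bool at_least(const PdfVersion& v, int major_version, int minor_version) {
    return v.major_version > major_version ||
           (v.major_version == major_version && v.minor_version >= minor_version);
}
```

A real reader would also consult the `/Version` entry in the document catalog, which can override the header from PDF 1.4 onward; the sketch ignores that.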
PDF DOM Explorer
Select an object to inspect its references, streams, and dictionaries:
Object ID | Type | References |
---|---|---|
1 | Page | Fonts, Images |
2 | Font | Encoding |
3 | Annotation | Page 1 |
DOM Tree View
- Page (ID: 1)
- Page (ID: 4)
- Outline (ID: 6)
AI Methodology for Next-Generation PDF Editing
Our primary objective is to make PDF editing as intuitive as editing a plain-text or word-processing document. This involves bridging the intricate structures found in PDF files with AI-driven features that recognize the semantics of text, images, and layout.
1. Advanced Layout Analysis: We employ a multi-stage layout processor that detects columns, tables, headers, footers, and other structural elements.
2. Context-Aware Text Editing: Our engine uses language models to detect semantic boundaries, paragraphs, and contextual meaning.
3. Semantic Object Detection: We integrate computer vision techniques to recognize images, shapes, or vector graphics in a PDF.
4. Text Reflow Engine: A specialized reflow subsystem dynamically adjusts paragraphs and text blocks as edits occur.
5. Incremental Update Strategy: Rather than rewriting entire PDFs, we leverage incremental updates.
6. Metadata and Tagging: Our system extracts and maintains rich metadata about fonts, colors, annotations, and accessibility tags.
7. Adaptive Document Conversion: When converting from other formats or scanning in documents, the AI engine attempts to reconstruct logical structure.
8. Research Backbone (2023-2024): The foundation of PDFSage includes ongoing research into NLP-driven layout analysis and improved vector-graphic manipulations.
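The reflow idea in item 4 can be illustrated with the simplest possible stand-in. The text above does not say which algorithm PDFSage uses, so this sketch substitutes plain greedy line filling, and it measures width in characters where a real engine would use glyph advances from the font. `reflow` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Greedy line filling: put as many words on each line as fit in max_width.
// Assumption: width is counted in characters, not real glyph metrics.
std::vector<std::string> reflow(const std::string& text, std::size_t max_width) {
    std::vector<std::string> lines;
    std::istringstream words(text);
    std::string word;
    std::string line;

    while (words >> word) {
        if (line.empty()) {
            line = word;  // a word longer than max_width still gets its own line
        } else if (line.size() + 1 + word.size() <= max_width) {
            line += ' ';
            line += word;
        } else {
            lines.push_back(line);
            line = word;
        }
    }
    if (!line.empty()) lines.push_back(line);
    return lines;
}
```

After an edit, the paragraph text is simply passed through `reflow` again. For example, `reflow("the quick brown fox jumps", 10)` yields the three lines `the quick`, `brown fox`, and `jumps`.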
Detailed Explanation of Parsing, Rendering, and In-Render Editing
1. Parsing Stage: PDFSage reads the PDF file structure and checks for corruption or inconsistencies.
2. Building the PDF DOM: Once objects are identified, we assemble them into a Document Object Model-like structure.
3. Rendering Engine: The rendering engine interprets the PDF DOM. It lays out text, images, and vector graphics.
4. Word-Processor-Like In-Render Editing: PDFSage introduces a reflow mode. When enabled, the user can edit text blocks in a more fluid way.
5. AI-Assisted Reflow with Copilot Support: AI assists real-time editing by suggesting synonyms or corrections.
6. Incremental Updates: We append edited objects to the end of the file as an incremental update, which avoids regenerating the entire file.
7. Conclusion: Our system combines robust performance, deep editing capabilities, and intuitive workflows.
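The incremental update step (item 6) relies on a property of the PDF format itself: new object versions, a new cross-reference section, and a new trailer can be appended after the existing bytes. Below is a simplified sketch of that append for a single object. `UpdatedObject` and `append_incremental_update` are hypothetical names, the generation number is fixed at 0, and the caller is assumed to already know the previous `startxref` offset, the catalog object number, and the trailer `/Size`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical description of one edited object.
struct UpdatedObject {
    unsigned id;       // object number being added or replaced
    std::string body;  // serialized content, e.g. "<< /Type /Annot >>"
};

// Appends the object, a one-entry xref section, and a trailer that links
// back to the previous xref section via /Prev. The original bytes are kept.
std::string append_incremental_update(const std::string& original,
                                      std::size_t prev_xref_offset,
                                      unsigned root_id,
                                      unsigned size,
                                      const UpdatedObject& obj) {
    std::string out = original;
    if (out.empty() || out.back() != '\n') out += '\n';

    const std::size_t obj_offset = out.size();
    out += std::to_string(obj.id) + " 0 obj\n" + obj.body + "\nendobj\n";

    const std::size_t xref_offset = out.size();
    // An xref entry is exactly 20 bytes: 10-digit offset, space,
    // 5-digit generation, space, 'n', then a two-byte end-of-line.
    char entry[32];
    std::snprintf(entry, sizeof entry, "%010zu %05u n \n", obj_offset, 0u);
    out += "xref\n" + std::to_string(obj.id) + " 1\n" + entry;

    out += "trailer\n<< /Size " + std::to_string(size) +
           " /Root " + std::to_string(root_id) + " 0 R" +
           " /Prev " + std::to_string(prev_xref_offset) + " >>\n";
    out += "startxref\n" + std::to_string(xref_offset) + "\n%%EOF\n";
    return out;
}
```

Because nothing before the appended section changes, the cost of a save scales with the size of the edit, not the size of the document.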
Extended PDF DOM & C++ PDFSage Engine Architecture
The PDF DOM serves as the central component of PDFSage.
- Nodes and Properties: Each node includes base properties and optional references to streams.
- Hierarchical Structure: Pages contain references to fonts, images, annotations, and other resources.
- Parallel Processing: For large PDFs, we break down rendering tasks by page or object cluster.
- Incremental Updates: The DOM can mark individual nodes as changed for efficient saves.
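A minimal sketch of how such a DOM might be represented, assuming a flat map of nodes keyed by object ID with a per-node changed flag. All names here (`NodeType`, `DomNode`, `PdfDom`) are hypothetical, and properties are simplified to strings; this is not the actual PDFSage data model.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class NodeType { Page, Font, Image, Annotation, Outline };

// One DOM node: base properties plus references to other objects by ID.
struct DomNode {
    std::uint32_t object_id;
    NodeType type;
    std::map<std::string, std::string> properties;  // simplified dictionary
    std::vector<std::uint32_t> references;          // IDs this node points at
    bool dirty = false;                             // changed since last save
};

class PdfDom {
public:
    // Registers a node as loaded from the file (not dirty).
    DomNode& add(std::uint32_t id, NodeType type) {
        return nodes_.try_emplace(id, DomNode{id, type, {}, {}, false}).first->second;
    }

    // Edits a property and marks only that node as changed.
    // Throws std::out_of_range if the node does not exist.
    void set_property(std::uint32_t id, const std::string& key, const std::string& value) {
        DomNode& node = nodes_.at(id);
        node.properties[key] = value;
        node.dirty = true;
    }

    // IDs of the nodes an incremental save would need to rewrite.
    std::vector<std::uint32_t> dirty_ids() const {
        std::vector<std::uint32_t> ids;
        for (const auto& [id, node] : nodes_) {
            if (node.dirty) ids.push_back(id);
        }
        return ids;
    }

private:
    std::map<std::uint32_t, DomNode> nodes_;
};
```

With a page (ID 1) that references a font (ID 2), calling `set_property(2, "Encoding", "WinAnsiEncoding")` leaves `dirty_ids()` containing only `2`, so a save would touch one object.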
The PDFSage C++ Engine is designed to run on multiple platforms.
- Modular Architecture: The engine is split into modules for parsing, rendering, editing, AI, and more.
- Dependency Management: We rely on modern C++ standards and manage third-party libraries carefully.
- Rendering Pipeline: The engine constructs display lists from the DOM.
- AI Integration Hooks: Specialized hooks allow the AI layer to inspect the DOM or partially rendered data.
- Thread-Safe DOM: The DOM includes atomic reference counters and concurrency-safe containers.
- Incremental File Updates: When saving, only the modified objects are rewritten.
- Platform-Specific Optimizations: On iOS, the engine leverages Metal for hardware-accelerated rendering.
Memory Management and Object References: Each PDF object is reference-counted.
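One way to get reference counting in modern C++ is `std::shared_ptr`. Whether PDFSage uses it or a custom intrusive counter is not stated, so treat the following as an assumption-laden sketch with hypothetical names. It also shows the usual pitfall: a page tree has `/Kids` pointing down and `/Parent` pointing up, and counting both directions would create a cycle that never frees.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical reference-counted PDF object.
struct PdfObject {
    unsigned id;
    std::vector<std::shared_ptr<PdfObject>> children;  // owning, like /Kids
    std::weak_ptr<PdfObject> parent;                   // non-owning, like /Parent

    explicit PdfObject(unsigned object_id) : id(object_id) {}
};

// Creates a child owned by `parent`; the back-reference does not add a count.
std::shared_ptr<PdfObject> add_child(const std::shared_ptr<PdfObject>& parent,
                                     unsigned id) {
    auto child = std::make_shared<PdfObject>(id);
    child->parent = parent;
    parent->children.push_back(child);
    return child;
}
```

Because the parent link is a `std::weak_ptr`, dropping the last outside reference to the root releases the whole subtree.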
Error Handling and Corruption Recovery: The engine logs anomalies and attempts to recover from partial corruption.
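A common recovery technique when the cross-reference table is damaged is to scan the file for `N G obj` headers and rebuild the offsets from them. The text above does not say PDFSage does exactly this, so the sketch below is an illustration with hypothetical names. It assumes object headers start at the beginning of a line and does not guard against matches inside stream data.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

// Reads decimal digits starting at `i`; returns the index after them
// (equal to `i` if there were none) and stores the number in `value`.
static std::size_t read_uint(std::string_view s, std::size_t i, std::size_t& value) {
    value = 0;
    std::size_t j = i;
    while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j]))) {
        value = value * 10 + static_cast<std::size_t>(s[j] - '0');
        ++j;
    }
    return j;
}

// Maps object number -> byte offset of its "N G obj" header.
std::map<std::size_t, std::size_t> rebuild_xref(std::string_view data) {
    std::map<std::size_t, std::size_t> offsets;
    for (std::size_t i = 0; i < data.size(); ++i) {
        const bool at_line_start = i == 0 || data[i - 1] == '\n' || data[i - 1] == '\r';
        if (!at_line_start) continue;

        std::size_t id = 0;
        std::size_t gen = 0;
        const std::size_t j = read_uint(data, i, id);
        if (j == i || j >= data.size() || data[j] != ' ') continue;
        const std::size_t k = read_uint(data, j + 1, gen);
        if (k == j + 1) continue;
        if (data.substr(k, 4) != " obj") continue;

        // Later definitions overwrite earlier ones, which matches how
        // incremental updates supersede older object versions.
        offsets[id] = i;
    }
    return offsets;
}
```

The rebuilt map can then stand in for the missing table while the parser continues, with the anomaly logged for the user.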
Handling Advanced PDF Features: We support form fields, embedded files, multimedia annotations, and advanced encryption.
Layers and Transparency Groups: The engine manages optional content groups and transparency layers.
Partial Download and Linearized PDF Handling: For linearized PDFs, pages can render before the entire file is downloaded.
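Before rendering early pages, a reader has to decide whether the file is linearized at all. A linearized file carries a parameter dictionary with a `/Linearized` key near the start, and its `/L` entry records the total file length; if the length no longer matches (for example after an incremental update), the linearization data should not be trusted. The sketch below uses naive text search in place of real dictionary parsing, and `LinearizationHint` and `check_linearized` are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

struct LinearizationHint {
    std::size_t declared_length;  // value of /L
    bool length_matches;          // true if /L equals the actual file size
};

// Returns std::nullopt if the file does not look linearized.
// Assumption: the dictionary is written as plain text in the first
// 1024 bytes and /L is followed by a single space.
std::optional<LinearizationHint> check_linearized(std::string_view file) {
    const std::string_view head = file.substr(0, 1024);
    if (head.find("/Linearized") == std::string_view::npos) return std::nullopt;

    const std::size_t key = head.find("/L ");
    if (key == std::string_view::npos) return std::nullopt;

    std::size_t i = key + 3;
    std::size_t length = 0;
    bool any_digit = false;
    while (i < head.size() && std::isdigit(static_cast<unsigned char>(head[i]))) {
        length = length * 10 + static_cast<std::size_t>(head[i] - '0');
        any_digit = true;
        ++i;
    }
    if (!any_digit) return std::nullopt;
    return LinearizationHint{length, length == file.size()};
}
```

When `length_matches` is false, falling back to the normal path (read the trailer at the end of the file) is the safe choice.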
Build Process (simplified example using CMake):
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j8
AI for Reflow Editing: Our AI components evaluate text layout in real time.
Additional C++ Engine Insights: We profile memory usage and CPU load across different workflows.
Contact Information
Phone: +1 781-999-4101
Email: bo@shang.software
Email: b0sh4ng@gmail.com
Email: bo@pdfsage.com
Email: bo@pdfsage.xyz