LLM OCR Studio
This is a modern web interface for OCR using Vision LLMs.
🚀 Quick Start
- Install dependencies: npm install
- Start the development server: npm run dev
Open your browser at http://localhost:3000.
2. Configure LLM
The application runs entirely in the browser but needs an OpenAI-compatible API to process images. The interface will warn you if the configuration is missing.
Click the Settings icon in the top right to configure:
- LM Studio (Local):
  - Base URL: http://localhost:1234/v1
  - Model: Enter the ID of the model loaded in LM Studio (e.g., llama-3.2-vision).
  - API Key: Leave empty.
- OpenRouter: https://openrouter.ai/api/v1 + API Key + Model Name.
- Any OpenAI Compatible Endpoint.
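As an illustration of the settings listed above, here is a minimal sketch of how such a configuration could be represented and checked before use. The `LlmConfig` interface, its field names, and both helper functions are hypothetical and are not taken from the app's source; the only values that come from this README are the LM Studio base URL and the example model ID.

```typescript
// Hypothetical shape for the settings described above; field names are assumptions.
interface LlmConfig {
  baseUrl: string;
  model: string;
  apiKey: string; // left empty for LM Studio
}

// Example values taken from the LM Studio entry in this README.
const LM_STUDIO_EXAMPLE: LlmConfig = {
  baseUrl: "http://localhost:1234/v1",
  model: "llama-3.2-vision",
  apiKey: "",
};

// Returns the names of required fields that are still blank.
// The API key is treated as optional because local servers may not need one.
function missingConfigFields(config: LlmConfig): string[] {
  const missing: string[] = [];
  if (config.baseUrl.trim() === "") missing.push("baseUrl");
  if (config.model.trim() === "") missing.push("model");
  return missing;
}

// Joins the base URL and the chat completions path without doubling slashes.
function chatCompletionsUrl(config: LlmConfig): string {
  return config.baseUrl.replace(/\/+$/, "") + "/chat/completions";
}
```

A check like `missingConfigFields` is one way an interface could decide whether to show a "configuration missing" warning; the app itself may do this differently.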
3. Usage
- Drag a PDF into the upload area.
- You can drag in multiple files or add more on the review screen (batch processing).
- The app converts files to images (Client-side).
- Sends images to the configured Vision LLM.
- Returns formatted Markdown.
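The last two steps above can be sketched roughly as follows, assuming each page has already been rendered to an image data URL and that the endpoint accepts the OpenAI-style chat completions format with `image_url` content parts. The `OcrSettings` type, the function names, and the prompt text are hypothetical; the app's actual request code may look quite different.

```typescript
// Hypothetical settings shape; the app's real types may differ.
interface OcrSettings {
  baseUrl: string;
  model: string;
  apiKey: string;
}

interface OcrRequestInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Minimal view of a fetch-like response, so the sketch needs no DOM typings.
interface OcrResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Builds the URL and request options for one page image (a data: URL).
function buildOcrRequest(settings: OcrSettings, imageDataUrl: string): { url: string; init: OcrRequestInit } {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only send an Authorization header when a key is configured.
  if (settings.apiKey !== "") headers["Authorization"] = `Bearer ${settings.apiKey}`;
  const body = {
    model: settings.model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Transcribe this page as Markdown." }, // assumed prompt
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      },
    ],
  };
  return {
    url: settings.baseUrl.replace(/\/+$/, "") + "/chat/completions",
    init: { method: "POST", headers, body: JSON.stringify(body) },
  };
}

// Pulls the text of the first choice out of a chat completions response body.
function extractMarkdown(response: unknown): string {
  const typed = response as { choices?: { message?: { content?: unknown } }[] } | null;
  const content = typed?.choices?.[0]?.message?.content;
  if (typeof content !== "string") throw new Error("Response has no text content");
  return content;
}

// Processes pages one at a time and joins the per-page Markdown with blank lines.
async function ocrPages(
  settings: OcrSettings,
  pages: string[],
  send: (url: string, init: OcrRequestInit) => Promise<OcrResponse>,
): Promise<string> {
  const parts: string[] = [];
  for (const page of pages) {
    const { url, init } = buildOcrRequest(settings, page);
    const res = await send(url, init);
    if (!res.ok) throw new Error(`LLM request failed with status ${res.status}`);
    parts.push(extractMarkdown(await res.json()));
  }
  return parts.join("\n\n");
}
```

In a browser, `send` could simply be `(url, init) => fetch(url, init)`. Passing it in as a parameter keeps the sketch easy to exercise with a stub instead of a live model.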
🛠 System Requirements
- Node.js: v18+
- API Access: Access to a Vision model (e.g., GPT-4o, Gemini 1.5, Qwen-VL via LM Studio).