PdfTextExtractor
Category: Web
Source: pdf_text_extractor.dart
Classes
PdfExtractionResult
Result of PDF text extraction.
Constructor
```dart
PdfExtractionResult({this.text, this.error, this.pageCount = 0})
```

```dart
factory PdfExtractionResult.withError(String error)
```

```dart
PdfExtractionResult(error: error)
```

Properties
| Property | Type | Description |
|---|---|---|
| text | String? | |
| error | String? | |
| pageCount | int | |
| isSuccess | bool get | |
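
The signatures above suggest a small immutable value type. The following is a minimal sketch of how such a class might be laid out, not the actual source. The body of `isSuccess` in particular is an assumption, since the reference does not say how success is determined.

```dart
/// Illustrative reconstruction based only on the documented signatures.
class PdfExtractionResult {
  final String? text;
  final String? error;
  final int pageCount;

  PdfExtractionResult({this.text, this.error, this.pageCount = 0});

  /// Shorthand for a failed extraction: only [error] is set.
  factory PdfExtractionResult.withError(String error) =>
      PdfExtractionResult(error: error);

  /// Assumption: success means no error was recorded and some text exists.
  bool get isSuccess => error == null && text != null;
}
```

Under that assumption, `PdfExtractionResult.withError('not a PDF')` is never a success, and its `pageCount` keeps the default of 0.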
PdfTextExtractor
Extracts text from PDF bytes using the pdftotext CLI tool.
pdftotext is part of poppler-utils, which can be installed on macOS (brew install poppler), Linux (apt install poppler-utils), and Windows (via Scoop or Chocolatey).
Constructor
```dart
PdfTextExtractor({this.timeoutSeconds = 60})
```

Properties
| Property | Type | Description |
|---|---|---|
| timeoutSeconds | int | |
Methods
static bool isPdfContent(Uint8List bytes)
Check if the PDF magic bytes are present.
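
A PDF file begins with the ASCII marker `%PDF`, so a magic-byte check can be as simple as comparing the first four bytes. The sketch below shows one way to do that. The function name `startsWithPdfMagic` is hypothetical, and the real `isPdfContent` may check more or fewer bytes.

```dart
import 'dart:typed_data';

/// ASCII codes for "%PDF", the marker that opens a PDF header.
const List<int> _pdfMagic = [0x25, 0x50, 0x44, 0x46];

/// Returns true when [bytes] starts with the "%PDF" marker.
/// Hypothetical helper; not the library's own implementation.
bool startsWithPdfMagic(Uint8List bytes) {
  if (bytes.length < _pdfMagic.length) return false;
  for (var i = 0; i < _pdfMagic.length; i++) {
    if (bytes[i] != _pdfMagic[i]) return false;
  }
  return true;
}
```

A check like this is cheap enough to run before writing anything to disk, and it catches the common case of an HTML error page served under a `.pdf` URL.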
static bool isPdfContentType(String contentType)
Check if a content-type header indicates PDF.
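
Content-Type values can carry parameters and vary in letter case, so a robust check normalizes the value first. This is a sketch under the assumption that only the `application/pdf` media type counts as PDF. The name `looksLikePdfContentType` is hypothetical, and the real method may accept other values.

```dart
/// Returns true when a Content-Type header value names the PDF media type.
/// Hypothetical helper; not the library's own implementation.
bool looksLikePdfContentType(String contentType) {
  // Drop parameters such as "; charset=binary" and compare case-insensitively.
  final mediaType = contentType.split(';').first.trim().toLowerCase();
  return mediaType == 'application/pdf';
}
```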
static Future<bool> checkPdftotextAvailable()
Check whether pdftotext is available on this system.
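
One common way to probe for a CLI tool is to try starting it and treat a launch failure as "not installed". The sketch below does this with `pdftotext -v`, which prints version information and exits. The function name is hypothetical, and the real method may probe differently.

```dart
import 'dart:io';

/// Returns true when a `pdftotext` executable can be started from PATH.
/// Hypothetical helper; not the library's own implementation.
Future<bool> pdftotextLooksAvailable() async {
  try {
    // `-v` asks pdftotext to print its version and exit.
    await Process.run('pdftotext', ['-v']);
    return true;
  } on ProcessException {
    // Thrown when the executable cannot be found or started.
    return false;
  }
}
```

The exit code is deliberately ignored here: the only question is whether the binary can be launched at all.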
Future<PdfExtractionResult> extract(Uint8List bytes)
Extract text from PDF [bytes] using pdftotext.
Writes the bytes to a temporary file, runs pdftotext on it, reads the output, and cleans up the temporary files.
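
The write, run, read, clean-up sequence described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `SketchResult` and `extractTextSketch` are hypothetical names, the output is assumed to go to a second temp file rather than stdout, and the timeout handling (kill the process, report an error) is one plausible choice.

```dart
import 'dart:io';
import 'dart:typed_data';

/// Hypothetical result holder for this sketch: either text or error is set.
class SketchResult {
  final String? text;
  final String? error;
  SketchResult({this.text, this.error});
}

Future<SketchResult> extractTextSketch(
  Uint8List bytes, {
  int timeoutSeconds = 60,
}) async {
  // A private temp directory keeps the input and output files together.
  final dir = await Directory.systemTemp.createTemp('pdf_extract_');
  final sep = Platform.pathSeparator;
  try {
    final input = File('${dir.path}${sep}input.pdf');
    final output = File('${dir.path}${sep}output.txt');
    await input.writeAsBytes(bytes);

    // Usage: pdftotext <pdf-file> <text-file>
    final process =
        await Process.start('pdftotext', [input.path, output.path]);
    // Consume both pipes so the child cannot block on a full buffer.
    final stdoutDone = process.stdout.drain<void>();
    final stderrText =
        process.stderr.transform(systemEncoding.decoder).join();

    var timedOut = false;
    final exitCode = await process.exitCode.timeout(
      Duration(seconds: timeoutSeconds),
      onTimeout: () {
        // Stop the child, then wait for it to actually exit.
        timedOut = true;
        process.kill();
        return process.exitCode;
      },
    );
    await stdoutDone;

    if (timedOut) {
      return SketchResult(
          error: 'pdftotext timed out after ${timeoutSeconds}s');
    }
    if (exitCode != 0) {
      final details = (await stderrText).trim();
      return SketchResult(
          error: 'pdftotext exited with code $exitCode: $details');
    }
    return SketchResult(text: await output.readAsString());
  } on ProcessException catch (e) {
    // pdftotext is missing or could not be started.
    return SketchResult(error: 'Could not start pdftotext: ${e.message}');
  } finally {
    // Clean up the temp files whether extraction succeeded or not.
    await dir.delete(recursive: true);
  }
}
```

Placing the clean-up in `finally` means the temp directory is removed on every path, including the early error returns.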