Datalab Open-Sources Lift: 9B Model for Document Data Extraction
Datalab has open-sourced Lift, a 9B-parameter model for extracting structured data from documents.
According to the developers, the model reaches 90.2% accuracy on their own benchmark, just behind Gemini 3.5 Flash at 91.3% and well ahead of specialized open-source solutions such as NuExtract3 (81.5%).
Lift can extract data according to a JSON Schema, and the median processing time is 9.5 seconds.
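To make schema-driven extraction concrete, here is a minimal sketch of what a JSON Schema for an invoice might look like, along with a small check on a result. The announcement does not show Lift's Python API, so the sketch does not call it. The schema fields, the sample record, and the helper check_against_schema are all hypothetical, and the helper covers only top-level "required" and "type" keywords, not the full JSON Schema specification.

```python
import json

# Hypothetical schema for an invoice; the field names are illustrative only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}

# Mapping from JSON Schema type names to Python types.
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_against_schema(record, schema):
    """Return a list of problems found in `record`.

    Deliberately simplified: checks only top-level required fields
    and primitive types, not nested schemas or other keywords.
    """
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in record:
            continue
        value = record[name]
        type_name = spec["type"]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if is_bool_as_number or not isinstance(value, _JSON_TYPES[type_name]):
            problems.append(f"{name}: expected {type_name}")
    return problems


# A made-up extraction result, standing in for whatever the model returns.
sample_output = json.loads('{"invoice_number": "INV-001", "total": 125.5}')
problems = check_against_schema(sample_output, INVOICE_SCHEMA)
```

The schema describes the shape of the data you want back, and a check like this gives a quick sanity test on any extractor's output. For real use, a full JSON Schema validator would be a better fit than this hand-rolled helper.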
To run it, all you need is: pip install lift-pdf