GLiNER2: Unified Schema-Based Information Extraction

58 points by apwheele a day ago on hackernews | 11 comments

hbcondo714 | a day ago

There is another version at:

https://github.com/urchade/GLiNER

Looks like it’s still being maintained too?

adsharma | 23 hours ago

Use GLiNER2. It's a much better model.

: hbcondo714 | 7 hours ago
Okay, but there is a dependency on the original GLiNER:
https://github.com/fastino-ai/GLiNER2/issues/69

deepsquirrelnet | a day ago

Zero-shot encoder models are so cool. I'll definitely be checking this out.

If you're looking for a zero-shot classifier, tasksource is in a similar vein.

https://huggingface.co/tasksource/ModernBERT-large-nli

iwhalen | a day ago

Very cool stuff. Love the focus on CPU-first.

Would also love to see some throughput numbers on a basic VM setup.

Edit: there are some latency numbers in the paper: https://arxiv.org/pdf/2507.18546

adsharma | 23 hours ago

Feels like it's written by ML people who don't follow Python software engineering practices.

No black, uv, or ruff.

Prints messages with emojis to stdout by default.

Makes a connection to Hugging Face on every import.

https://github.com/fastino-ai/GLiNER2/pull/74
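
For illustration, the stdout and import-time complaints above are usually addressed with two stdlib-only patterns: log through a named logger instead of printing, and defer any expensive or networked load until first use. This is a generic sketch, not GLiNER2's code and not what the linked PR does; `mylib`, `LazyModel`, and `fake_loader` are hypothetical names.

```python
import logging

# Library code logs through a named logger instead of print(). The NullHandler
# keeps the library silent unless the application configures logging itself.
logger = logging.getLogger("mylib")  # hypothetical package name
logger.addHandler(logging.NullHandler())


class LazyModel:
    """Defers an expensive (possibly networked) load until first use."""

    def __init__(self, loader):
        self._loader = loader  # callable that actually fetches/loads the model
        self._model = None
        self.load_count = 0    # how many times the loader has run

    def get(self):
        if self._model is None:
            logger.info("loading model")
            self._model = self._loader()
            self.load_count += 1
        return self._model


def fake_loader():
    # Hypothetical stand-in for a real download from a model hub.
    return {"name": "demo-model"}


# Constructing the wrapper (e.g. at import time) performs no I/O.
model = LazyModel(fake_loader)
```

The loader only runs on the first `model.get()` call, so merely importing a module that builds `model` triggers no network traffic and prints nothing.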

fbilhaut | 17 hours ago

GLiNER is really great research work, but putting this kind of thing into production is a different job altogether. I'm not trying to self-promote here, but there are alternatives for this purpose, like gline-rs (https://github.com/fbilhaut/gline-rs). Support for GLiNER 2 models is on the way.

: adsharma | 13 hours ago
Any chance you could wrap this in PyO3? There is a large Python market for this.

snthpy | 22 hours ago

This looks great. Thank you!

plaguna | 21 hours ago

I guess this is only for text? What if the documents are PDFs? What is the recommended way to convert PDF to text?

: akreal | 16 hours ago
Docling: https://github.com/docling-project/docling
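
A minimal sketch of the PDF-to-text step with Docling, assuming the `docling` package is installed and using its `DocumentConverter` as shown in the project README. The `pdf_to_markdown` helper and its `converter` parameter are hypothetical, added only so the function can be tried without Docling present.

```python
def pdf_to_markdown(source, converter=None):
    """Convert a PDF (local path or URL) to Markdown text.

    `converter` is a hypothetical injection point: any object with a
    `convert(source)` method whose result has `.document.export_to_markdown()`.
    """
    if converter is None:
        # Imported lazily so this module loads even without docling installed.
        from docling.document_converter import DocumentConverter
        converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

The returned Markdown string can then be passed to a text-only extraction model; long documents may need to be split into chunks first.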