-
Deploying state-of-the-art object detectors (DETRs) to AWS.
Forget about YOLO. Transformer-based models are better now, and easy to deploy!
-
ScreenshotQuery: query your screenshots using Vision Language Models.
Describe and chat with your set of screenshots.
-
Three random scripts.
I made some cool little code snippets that I'd like to share.
-
The future of OCR is not to use OCR!
End-to-end document understanding will be huge.
-
Caution while using Large Language Models
Weird failures and behavior from LLMs.