Moondream2
Moondream2 is a multimodal model that supports image captioning, zero-shot object detection, point-prompt detection, and visual question answering.
You can deploy Moondream2 with Inference.
Installation
To install inference with the extra dependencies necessary to run Moondream2, run
pip install inference[transformers]
or
pip install inference-gpu[transformers]
How to Use Moondream2
Create a new Python file called app.py and add the following code:
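from PIL import Image
from inference.models.moondream2.moondream2 import Moondream2

# Load the model; replace API_KEY with your Roboflow API key
model = Moondream2(api_key="API_KEY")

# Open the image to run inference on
image = Image.open("dog.jpeg")
prompt = "How many dogs are in this image?"

# Ask the model a question about the image and print its answer
result = model.query(image, prompt)
print(result)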
In this code, we load Moondream2, run it on an image with a text prompt, and print the model's answer to the console.
Above, replace:
prompt with the prompt for the model.
dog.jpeg with the path to the image that you want to run inference on.
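If you want to ask several questions about the same image, one option is a small helper that loops over a list of prompts. The sketch below is illustrative only: ask_all is a hypothetical helper, not part of Inference, and it assumes nothing about the model beyond the query(image, prompt) call used in the example above.

```python
from typing import Any, Dict, Iterable


def ask_all(model: Any, image: Any, prompts: Iterable[str]) -> Dict[str, Any]:
    """Run each prompt against the same image and collect the results by prompt.

    Hypothetical helper: `model` is any object that exposes
    query(image, prompt), as in the example above.
    """
    return {prompt: model.query(image, prompt) for prompt in prompts}


# Example usage (assumes `model` and `image` were created as in app.py):
# answers = ask_all(model, image, ["How many dogs are in this image?",
#                                  "What color is the dog?"])
# for question, answer in answers.items():
#     print(question, "->", answer)
```

Each prompt triggers a separate model call, so the run time grows with the number of prompts.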
To use Moondream2 with Inference, you will need a Roboflow API key. If you don't already have a Roboflow account, sign up for a free Roboflow account.
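Rather than hard-coding the key in app.py, you may prefer to read it from an environment variable. The following is a minimal sketch: load_api_key is a hypothetical helper, and the variable name ROBOFLOW_API_KEY is an assumption chosen for illustration, so use whichever name fits your setup.

```python
import os


def load_api_key(env_var: str = "ROBOFLOW_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper; the default variable name is an assumption.
    Raises RuntimeError if the variable is missing or empty.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Roboflow API key."
        )
    return key


# Example usage in app.py:
# model = Moondream2(api_key=load_api_key())
```

This keeps the key out of source control and fails with a clear message when the variable is not set.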
Then, run the Python script you have created:
python app.py
The result from your model will be printed to the console.