Florence-2
Florence-2 is a multimodal model developed by Microsoft Research.
You can use Florence-2 for:
- Object detection: Identify the location of all objects in an image. (<OD>)
- Dense region captioning: Generate dense captions for all identified regions in an image. (<DENSE_REGION_CAPTION>)
- Image captioning: Generate a caption for a whole image. (<CAPTION> for a short caption, <DETAILED_CAPTION> for a more detailed caption, and <MORE_DETAILED_CAPTION> for an even more detailed caption)
- Region proposal: Identify regions where there are likely to be objects in an image. (<REGION_PROPOSAL>)
- Phrase grounding: Identify the location of objects that match a text description. (<CAPTION_TO_PHRASE_GROUNDING>)
- Referring expression segmentation: Identify a segmentation mask that corresponds with a text input. (<REFERRING_EXPRESSION_SEGMENTATION>)
- Region to segmentation: Calculate a segmentation mask for an object from a bounding box region. (<REGION_TO_SEGMENTATION>)
- Open vocabulary detection: Identify the location of objects that match a text prompt. (<OPEN_VOCABULARY_DETECTION>)
- Region to description: Generate a description for a region in an image. (<REGION_TO_DESCRIPTION>)
- Optical Character Recognition (OCR): Read the text in an image. (<OCR>)
- OCR with region: Read the text in a specific region in an image. (<OCR_WITH_REGION>)
You can use Inference for all the Florence-2 tasks above.
The text in parentheses after each task is the task prompt you pass to the model to run that task.
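To keep the task prompts in one place, they can be collected in a small lookup table. The following is a minimal sketch, not part of Inference: the dictionary keys and the `build_prompt` helper are hypothetical names chosen for illustration, and only the prompt strings come from the list above. Appending a text input directly after the task prompt follows the convention shown in the Florence-2 model card on Hugging Face; whether Inference expects the same format is an assumption.

```python
# Task prompts copied from the list above.
# The dictionary keys are illustrative names, not identifiers used by Inference.
FLORENCE2_TASK_PROMPTS = {
    "object_detection": "<OD>",
    "dense_region_caption": "<DENSE_REGION_CAPTION>",
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "more_detailed_caption": "<MORE_DETAILED_CAPTION>",
    "region_proposal": "<REGION_PROPOSAL>",
    "phrase_grounding": "<CAPTION_TO_PHRASE_GROUNDING>",
    "referring_expression_segmentation": "<REFERRING_EXPRESSION_SEGMENTATION>",
    "region_to_segmentation": "<REGION_TO_SEGMENTATION>",
    "open_vocabulary_detection": "<OPEN_VOCABULARY_DETECTION>",
    "region_to_description": "<REGION_TO_DESCRIPTION>",
    "ocr": "<OCR>",
    "ocr_with_region": "<OCR_WITH_REGION>",
}


def build_prompt(task, text=""):
    """Return the task prompt for `task`, optionally followed by a text input.

    Assumption: tasks that take text (for example phrase grounding) receive
    it appended directly after the task prompt.
    """
    if task not in FLORENCE2_TASK_PROMPTS:
        raise ValueError(f"Unknown Florence-2 task: {task!r}")
    return FLORENCE2_TASK_PROMPTS[task] + text
```

For example, `build_prompt("caption")` returns `<CAPTION>`, which can then be passed as the `prompt` argument when running the model.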
How to Use Florence-2
Install inference
To install inference with Florence-2 support, use the following command on a CPU machine:
pip install inference[transformers]
Or use the following command on a GPU machine:
pip install inference-gpu[transformers]
Create a new Python file called app.py and add the following code:
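from inference import get_model

# Load the Florence-2 base model. Replace API_KEY with your Roboflow API key.
model = get_model("florence-2-base", api_key="API_KEY")

# Run the short captioning task on an example image.
result = model.infer(
    "https://media.roboflow.com/inference/seawithdock.jpeg",
    prompt="<CAPTION>",
)

# Print the response from the first result.
print(result[0].response)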
Above, replace <CAPTION> with the task prompt for the task you want to use.
Replace API_KEY with your Roboflow API key. Learn how to retrieve your Roboflow API key.
To use Florence-2 with Inference, you will need a Roboflow API key. If you don't already have a Roboflow account, sign up for a free Roboflow account.
Then, run the Python script you have created:
python app.py
The result from your model will be printed to the console.
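To try several task prompts on the same image, the `infer` call can be wrapped in a loop. The sketch below is one possible way to do this, under stated assumptions: `run_prompts` and `main` are hypothetical helpers, and reading the key from an environment variable named `ROBOFLOW_API_KEY` is an illustrative choice rather than a requirement. It uses only the `get_model`, `infer`, and `response` calls shown in app.py, and only the three captioning prompts, since they need no extra text input.

```python
import os

IMAGE_URL = "https://media.roboflow.com/inference/seawithdock.jpeg"

# The three captioning prompts from the task list.
CAPTION_PROMPTS = ["<CAPTION>", "<DETAILED_CAPTION>", "<MORE_DETAILED_CAPTION>"]


def run_prompts(model, image, prompts):
    """Run each task prompt against `image` and collect the responses.

    `model` is any object with the same infer(image, prompt=...) interface
    used in app.py.
    """
    responses = {}
    for prompt in prompts:
        result = model.infer(image, prompt=prompt)
        responses[prompt] = result[0].response
    return responses


def main():
    # Imported here so run_prompts can be reused without inference installed.
    from inference import get_model

    # Assumption: the API key is stored in an environment variable.
    api_key = os.environ["ROBOFLOW_API_KEY"]
    model = get_model("florence-2-base", api_key=api_key)

    for prompt, response in run_prompts(model, IMAGE_URL, CAPTION_PROMPTS).items():
        print(prompt, response)
```

Calling `main()` at the end of the file prints one response per prompt, which makes it easy to compare the short, detailed, and more detailed captions for the same image.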