Modes of Running Workflows
Workflows can be executed in a `local` or a `remote` environment. `local` means that model steps are executed within the process running the code. `remote` redirects model steps to a remote API, using HTTP requests to send images and receive predictions back.
When workflows are used directly in Python code, the `compile_and_execute(...)` and `compile_and_execute_async(...)` functions accept a `step_execution_mode` parameter that controls the execution mode. Additionally, the `max_concurrent_steps` parameter dictates how many steps can be executed in parallel. This improves the efficiency of `remote` execution (up to the limits of the remote API capacity) and can also improve `local` execution if the `model_manager` instance is capable of running parallel requests (only when using extensions from `inference.enterprise.parallel`).
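The effect of `step_execution_mode` can be illustrated with a toy dispatcher. This is a hypothetical sketch, not the library's actual implementation: the `StepExecutionMode` enum and `execute_step` function below are invented for illustration only.

```python
from enum import Enum


class StepExecutionMode(Enum):
    # Mirrors the two modes described above (illustrative, not the real class).
    LOCAL = "local"
    REMOTE = "remote"


def execute_step(step_name: str, mode: StepExecutionMode) -> str:
    # In "local" mode the model step runs inside the current process;
    # in "remote" mode the image would be sent to a remote API over HTTP.
    if mode is StepExecutionMode.LOCAL:
        return f"{step_name}: executed in-process"
    return f"{step_name}: dispatched via HTTP to remote API"


print(execute_step("detect", StepExecutionMode.REMOTE))
```

In the real functions, the same choice is made once per workflow run by passing `step_execution_mode`, and `max_concurrent_steps` bounds how many such steps run at the same time.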
There are environment variables that control `workflows` behaviour:

- `DISABLE_WORKFLOW_ENDPOINTS` - disables workflows endpoints in the HTTP API
- `WORKFLOWS_STEP_EXECUTION_MODE` - with allowed values `local` and `remote`, controls how `workflows` are executed in the `inference` HTTP container
- `WORKFLOWS_REMOTE_API_TARGET` - with allowed values `hosted` and `self-hosted`, points to the API to be used in `remote` execution mode
- `LOCAL_INFERENCE_API_URL` - will be used if `WORKFLOWS_REMOTE_API_TARGET=self-hosted` and `WORKFLOWS_STEP_EXECUTION_MODE=remote`
- `WORKFLOWS_MAX_CONCURRENT_STEPS` - max concurrent steps allowed by the `workflows` executor
- `WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_BATCH_SIZE` - max batch size for requests made to the remote API when `remote` execution mode is chosen
- `WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_CONCURRENT_REQUESTS` - max concurrent requests possible within a single step execution when `remote` execution mode is chosen
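Putting the variables together, a configuration for remote step execution against a self-hosted API might look like the sketch below. The specific values (URL, counts) are illustrative assumptions, not recommended defaults.

```shell
# Illustrative configuration for remote step execution against a
# self-hosted inference API (values are examples, not defaults).
export WORKFLOWS_STEP_EXECUTION_MODE=remote
export WORKFLOWS_REMOTE_API_TARGET=self-hosted
export LOCAL_INFERENCE_API_URL=http://127.0.0.1:9001
export WORKFLOWS_MAX_CONCURRENT_STEPS=4
export WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_BATCH_SIZE=8
export WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_CONCURRENT_REQUESTS=2
```

The same variables can be passed to a container with `-e` flags instead of `export` when running the `inference` HTTP server in Docker.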