Modes of Running Workflows

Workflows can be executed in a local or a remote environment. In local mode, model steps run within the process that executes the workflow. In remote mode, model steps are redirected to a remote API: images are sent over HTTP and predictions come back in the response.

When workflows are used directly in Python code, the compile_and_execute(...) and compile_and_execute_async(...) functions accept a step_execution_mode parameter that controls the execution mode.
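
For illustration, here is a minimal sketch of a local-mode invocation. Only the step_execution_mode parameter is confirmed by this section; the import path, the way the workflow definition is loaded, and the other argument names are assumptions made for the example, so adjust them to the API of your installed version:

```python
import json

# Import path is an assumption - locate compile_and_execute in the
# workflows compiler module of your installed version of inference.
from inference.enterprise.workflows.complier.core import compile_and_execute

# Load your workflow specification (structure depends on your definition).
with open("workflow_specification.json") as f:
    workflow_specification = json.load(f)

result = compile_and_execute(
    workflow_specification=workflow_specification,  # assumed argument name
    runtime_parameters={"image": "path/to/image.jpg"},  # assumed argument name
    step_execution_mode="local",  # model steps run inside this process
)
print(result)
```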

Additionally, the max_concurrent_steps parameter dictates how many steps can be executed in parallel. This improves the efficiency of remote execution (up to the capacity limits of the remote API) and can also improve local execution when the model_manager instance is capable of serving parallel requests (currently only with the extensions from inference.enterprise.parallel).
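
As a sketch of remote execution with parallel steps (same caveats as above - apart from step_execution_mode and max_concurrent_steps, the import path and argument names are assumptions):

```python
import asyncio
import json

# Import path is an assumption - adjust to your installed version.
from inference.enterprise.workflows.complier.core import compile_and_execute_async

with open("workflow_specification.json") as f:
    workflow_specification = json.load(f)

async def run():
    return await compile_and_execute_async(
        workflow_specification=workflow_specification,  # assumed argument name
        runtime_parameters={"image": "path/to/image.jpg"},  # assumed argument name
        step_execution_mode="remote",  # model steps become HTTP calls to the remote API
        max_concurrent_steps=4,  # up to 4 independent steps in flight at once
    )

results = asyncio.run(run())
```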

There are environment variables that control workflows behaviour:

  • DISABLE_WORKFLOW_ENDPOINTS - disables the workflows endpoints in the HTTP API
  • WORKFLOWS_STEP_EXECUTION_MODE - accepts local or remote and controls how workflows are executed in the inference HTTP container
  • WORKFLOWS_REMOTE_API_TARGET - accepts hosted or self-hosted and points at the API to be used in remote execution mode
  • LOCAL_INFERENCE_API_URL - the API URL used when WORKFLOWS_REMOTE_API_TARGET=self-hosted and WORKFLOWS_STEP_EXECUTION_MODE=remote
  • WORKFLOWS_MAX_CONCURRENT_STEPS - maximum number of concurrent steps allowed by the workflows executor
  • WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_BATCH_SIZE - maximum batch size for requests made to the remote API when remote execution mode is chosen
  • WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_CONCURRENT_REQUESTS - maximum number of concurrent requests allowed within a single step execution when remote execution mode is chosen
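
For example, a self-hosted remote setup could be configured as below. In practice these variables would typically be set in the environment of the container serving the HTTP API (for instance via docker run -e ...) before it starts; the Python snippet is only a sketch of equivalent values, and the URL and numbers are illustrative:

```python
import os

# Route workflow model steps to a self-hosted inference API over HTTP.
os.environ["WORKFLOWS_STEP_EXECUTION_MODE"] = "remote"
os.environ["WORKFLOWS_REMOTE_API_TARGET"] = "self-hosted"
os.environ["LOCAL_INFERENCE_API_URL"] = "http://127.0.0.1:9001"  # illustrative URL

# Bound the executor's parallelism and each step's remote request fan-out.
os.environ["WORKFLOWS_MAX_CONCURRENT_STEPS"] = "8"
os.environ["WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_BATCH_SIZE"] = "16"
os.environ["WORKFLOWS_REMOTE_EXECUTION_MAX_STEP_CONCURRENT_REQUESTS"] = "4"
```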