The key to understanding TorchServe is to first understand torch-model-archiver, which packages model artifacts into a single model archive file (.mar). torch-model-archiver needs the following inputs:
TorchScript
Needs a checkpoint file containing an executable ScriptModule.
Eager Mode (more common)
Needs a model definition file and a state_dict file.
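A minimal sketch of producing each kind of serialized file; the toy model and file names here are hypothetical:

import torch
import torch.nn as nn

# A toy model stands in for your real network
model = nn.Linear(4, 2)
example_input = torch.randn(1, 4)

# Eager mode: save only the state_dict; the class definition ships
# separately in the model definition file.
torch.save(model.state_dict(), "model.pth")

# TorchScript: trace the model into an executable ScriptModule and save it.
torch.jit.trace(model, example_input).save("model_scripted.pt")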
CLI
The CLI produces a .mar file. Its usage is shown below, followed by an example of archiving an eager mode model.
usage: torch-model-archiver [-h] --model-name MODEL_NAME
[--serialized-file SERIALIZED_FILE]
[--model-file MODEL_FILE] --handler HANDLER
[--extra-files EXTRA_FILES]
[--runtime {python,python2,python3}]
[--export-path EXPORT_PATH]
[--archive-format {tgz,no-archive,default}] [-f]
-v VERSION [-r REQUIREMENTS_FILE]
Torch Model Archiver Tool
optional arguments:
-h, --help show this help message and exit
--model-name MODEL_NAME
Exported model name. Exported file will be named as
model-name.mar and saved in current working directory if no --export-path is
specified, else it will be saved under the export path
--serialized-file SERIALIZED_FILE
Path to .pt or .pth file containing state_dict in case of eager mode
or an executable ScriptModule in case of TorchScript or TensorRT
or a .onnx file in the case of ORT.
--model-file MODEL_FILE
Path to python file containing model architecture.
This parameter is mandatory for eager mode models.
The model architecture file must contain only one
class definition extended from torch.nn.modules.
--handler HANDLER TorchServe's default handler name
or Handler path to handle custom inference logic.
--extra-files EXTRA_FILES
Comma separated path to extra dependency files.
--runtime {python,python2,python3}
The runtime specifies which language to run your inference code on.
The default runtime is "python".
--export-path EXPORT_PATH
Path where the exported .mar file will be saved. This is an optional
parameter. If --export-path is not specified, the file will be saved in the
current working directory.
--archive-format {tgz,no-archive,default}
The format in which the model artifacts are archived.
"tgz": This creates the model-archive in <model-name>.tar.gz format.
If platform hosting TorchServe requires model-artifacts to be in ".tar.gz"
use this option.
"no-archive": This option creates an non-archived version of model artifacts
at "export-path/{model-name}" location. As a result of this choice,
MANIFEST file will be created at "export-path/{model-name}" location
without archiving these model files
"default": This creates the model-archive in <model-name>.mar format.
This is the default archiving format. Models archived in this format
will be readily hostable on native TorchServe.
-f, --force When the -f or --force flag is specified, an existing .mar file with same
name as that provided in --model-name in the path specified by --export-path
will be overwritten
-v VERSION, --version VERSION
Model's version
-r REQUIREMENTS_FILE, --requirements-file REQUIREMENTS_FILE
Path to a requirements.txt containing model specific python dependency
packages.
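For example, archiving an eager mode model might look like this (the file names follow the densenet161 example from the TorchServe docs; substitute your own artifacts):

!torch-model-archiver --model-name densenet161 \
    --version 1.0 \
    --model-file model.py \
    --serialized-file densenet161.pth \
    --handler image_classifier \
    --extra-files index_to_name.json

This drops densenet161.mar into the current working directory, ready to be placed in a model store.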
Handler
TorchServe has the following built-in handlers that perform pre- and post-processing:
image_classifier
object_detector
text_classifier
image_segmenter
You can implement your own custom handler by following these docs. Most of the time you only need to subclass BaseHandler and override preprocess and/or postprocess.
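A minimal sketch of such a subclass, assuming TorchServe's BaseHandler API; the class name and tensor shapes are hypothetical:

from ts.torch_handler.base_handler import BaseHandler
import torch

class MyHandler(BaseHandler):
    def preprocess(self, data):
        # Each request in the batch arrives as a dict; the payload is
        # under "data" or "body" depending on how the client sent it.
        payload = data[0].get("data") or data[0].get("body")
        tensor = torch.as_tensor(payload, dtype=torch.float32)
        return tensor.unsqueeze(0)  # add a batch dimension

    def postprocess(self, inference_output):
        # TorchServe expects a list with one entry per request in the batch.
        return inference_output.argmax(dim=1).tolist()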
--extra-files ... index_to_name.json:
From the docs:
image_classifier, text_classifier and object_detector can all automatically map from numeric classes (0,1,2…) to friendly strings. To do this, simply include in your model archive a file, index_to_name.json, that contains a mapping of class number (as a string) to friendly name (also as a string).
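For illustration, index_to_name.json for a hypothetical two-class model might look like:

{
    "0": "cat",
    "1": "dog"
}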
Serving
After archiving you can start the model server:
TorchServe uses default ports 8080 / 8081 / 8082 for REST based inference, management & metrics APIs and 7070 / 7071 for gRPC APIs.
!torchserve --help
usage: torchserve [-h] [-v | --start | --stop] [--ts-config TS_CONFIG]
[--model-store MODEL_STORE]
[--workflow-store WORKFLOW_STORE]
[--models MODEL_PATH1 MODEL_NAME=MODEL_PATH2... [MODEL_PATH1 MODEL_NAME=MODEL_PATH2... ...]]
[--log-config LOG_CONFIG] [--foreground]
[--no-config-snapshots] [--plugins-path PLUGINS_PATH]
Torchserve
optional arguments:
-h, --help show this help message and exit
-v, --version Return TorchServe Version
--start Start the model-server
--stop Stop the model-server
--ts-config TS_CONFIG
Configuration file for model server
--model-store MODEL_STORE
Model store location from where local or default
models can be loaded
--workflow-store WORKFLOW_STORE
Workflow store location from where local or default
workflows can be loaded
--models MODEL_PATH1 MODEL_NAME=MODEL_PATH2... [MODEL_PATH1 MODEL_NAME=MODEL_PATH2... ...]
Models to be loaded using [model_name=]model_location
format. Location can be a HTTP URL or a model archive
file in MODEL_STORE.
--log-config LOG_CONFIG
Log4j configuration file for model server
--foreground Run the model server in foreground. If this option is
disabled, the model server will run in the background.
--no-config-snapshots, --ncs
Prevents the server from storing config snapshot files.
--plugins-path PLUGINS_PATH, --ppath PLUGINS_PATH
plugin jars to be included in torchserve class path
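Putting it together, a typical start command might look like this (the model store path and archive name carry over from the hypothetical archiving example above):

!mkdir -p model_store && mv densenet161.mar model_store/
!torchserve --start --model-store model_store --models densenet161=densenet161.mar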
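Once the server is up, you can exercise the REST APIs; the model name and input image here are assumptions carried over from the example above:

!curl http://127.0.0.1:8080/predictions/densenet161 -T kitten.jpg
!curl http://127.0.0.1:8081/models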