RESTful API
Nidaba includes a RESTful API server and an experimental web user interface. To start up the server locally just run:
$ nidaba api_server
To create batches remotely, use the normal nidaba commands with the -h/--host option:
$ nidaba batch -h http://127.0.0.1:8080/api/v1 --grayscale -l tesseract -o tesseract:languages=eng,extended=True -- input.tif
or:
$ nidaba status -h http://127.0.0.1:8000/api/v1 cf644c49-01b9-44e3-82fc-a4073f0980ef
Schema
All data is sent and received as JSON.
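As a rough illustration of the JSON convention, the sketch below builds (but does not send) a request with an optional JSON body and decodes a JSON response body. The names API_ROOT, build_request and decode_body are hypothetical helpers, not part of nidaba, and the root URL is an assumption copied from the first command-line example above.

```python
import json
import urllib.request

# Assumption: the API root used by the first CLI example; adjust to your server.
API_ROOT = "http://127.0.0.1:8080/api/v1"


def build_request(method, path, payload=None, root=API_ROOT):
    """Build (but do not send) a request, attaching payload as a JSON body."""
    data = None
    headers = {"Accept": "application/json"}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(root + path, data=data,
                                  headers=headers, method=method)


def decode_body(raw):
    """Decode a JSON response body given as bytes; an empty body yields None."""
    return json.loads(raw.decode("utf-8")) if raw else None
```

A request built this way could be passed to urllib.request.urlopen against a running server.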
Client Errors
HTTP Verbs
Where possible, the API strives to use appropriate HTTP verbs for each action.
API Reference
GET /api/v1/tasks/(group)/(task)
GET /api/v1/tasks/(group)
GET /api/v1/tasks
Retrieves the list of available tasks, their arguments and valid values for those arguments.
** Request **
GET /tasks
** Response **
HTTP/1.1 200 OK

{
    "img": {
        "deskew": {},
        "dewarp": {},
        "rgb_to_gray": {}
    },
    "binarize": {
        "nlbin": {
            "border": "float",
            "escale": "float",
            "high": [0, 100],
            "low": [0, 100]
        },
        "otsu": {},
        "sauvola": {
            "factor": [0.0, 1.0],
            "whsize": "int"
        }
    },
    "segmentation": {
        "kraken": {},
        "tesseract": {}
    },
    "ocr": {
        "kraken": {
            "model": ["fraktur.pyrnn.gz", "default", "teubner"]
        },
        "tesseract": {
            "extended": [false, true],
            "languages": ["chr", "chi_tra", "ita_old", "ceb"]
        }
    },
    "postprocessing": {
        "spell_check": {
            "filter_punctuation": [true, false],
            "language": ["latin", "polytonic_greek"]
        }
    },
    "output": {
        "metadata": {
            "metadata": "file",
            "validate": [true, false]
        },
        "tei2hocr": {},
        "tei2simplexml": {},
        "tei2txt": {}
    }
}
It is also possible to retrieve only a subset of task definitions by adding to the request a task group and/or the task name:
** Request **
GET /tasks/segmentation
** Response **
HTTP/1.1 200 OK

{
    "segmentation": {
        "kraken": {},
        "tesseract": {}
    }
}
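The three request forms for task definitions differ only in their trailing path segments. A minimal sketch of building that path follows; tasks_path is a hypothetical helper name, and the rule that a task name requires its group is inferred from the endpoint layout /tasks/(group)/(task).

```python
from urllib.parse import quote


def tasks_path(group=None, task=None):
    """Return the request path for all tasks, one group, or one task."""
    if task is not None and group is None:
        # Inferred from the endpoint layout: a task is addressed below its group.
        raise ValueError("a task name requires its group")
    segments = [s for s in (group, task) if s is not None]
    # Percent-encode each segment so unusual names cannot alter the path.
    return "/".join(["/tasks"] + [quote(s, safe="") for s in segments])
```

For instance, tasks_path("segmentation") yields the path used in the request above.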
Currently there are 4 different argument types:
- “int”: An integer
- “float”: A float (floats serialized to integers, e.g. 1.0 to 1, are also accepted)
- “str”: A UTF-8 encoded string
- “file”: A file on the storage medium, referenced by its URL
Finally, there are lists of valid argument values, from which one or more values may be picked, and value ranges.
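A client may want to check a value against one of the four type names before submitting it. The sketch below is one possible client-side check, not the server's validation logic; matches_type is a hypothetical name, and lists of valid values and value ranges are deliberately left out because the text above does not say how the two are told apart.

```python
def matches_type(type_name, value):
    """Return True if value fits one of the four documented scalar types."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it for all four types here.
        return False
    if type_name == "int":
        return isinstance(value, int)
    if type_name == "float":
        # Floats serialized to integers (1.0 as 1) are also accepted.
        return isinstance(value, (int, float))
    if type_name in ("str", "file"):
        # A file is referenced by its URL, so it is passed as a string.
        return isinstance(value, str)
    raise ValueError("unknown argument type: {0!r}".format(type_name))
```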
POST /api/v1/batch
Creates a new batch and returns its identifier.
** Request **
POST /batch
** Response **
HTTP/1.1 201 CREATED

{
    "id": "78a1f1e4-cc76-40ce-8a98-77b54362a00e",
    "url": "/batch/78a1f1e4-cc76-40ce-8a98-77b54362a00e"
}
Status Codes: - 201 Created – Successfully created
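The creation response can be unpacked and its url reused for the other batch endpoints documented on this page. This is a minimal sketch; parse_created_batch and batch_resources are hypothetical helper names, and the derived paths simply mirror the /batch/(batch_id), /batch/(batch_id)/tasks and /batch/(batch_id)/pages routes listed in this reference.

```python
import json


def parse_created_batch(body):
    """Return (id, url) from the JSON body of a batch creation response."""
    doc = json.loads(body)
    return doc["id"], doc["url"]


def batch_resources(batch_url):
    """Derive the sub-resource paths of a batch from its url."""
    base = batch_url.rstrip("/")
    return {
        "batch": base,             # POST executes the batch, GET returns its state
        "tasks": base + "/tasks",  # add or list tasks
        "pages": base + "/pages",  # add or list pages
    }
```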
POST /api/v1/batch/(batch_id)/tasks/(group)/(task)
POST /api/v1/batch/(batch_id)/tasks/(group)
POST /api/v1/batch/(batch_id)/tasks
Adds a particular configuration of a task to the batch identified by batch_id.
** Request **
POST /batch/:batch_id/tasks/:group/:task

{
    "kwarg_1": "value",
    "kwarg_2": 10,
    "kwarg_3": "true",
    "kwarg_4": ["a", "b"],
    "kwarg_5": "/pages/:batch_id/path"
}
** Response **
HTTP/1.1 201 CREATED
To pass files as arguments, use the URL returned by the call that added them to the batch. Booleans are sent as strings containing either ‘True’/‘true’ or ‘False’/‘false’.
Status Codes: - 201 Created – task created
- 404 Not Found – batch, group, or task not found.
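Because booleans travel as strings, a client has to convert Python booleans before serializing the task arguments. The following sketch shows one way to do that; encode_task_args is a hypothetical helper, and sending the arguments as a JSON object is an assumption based on the request example above.

```python
import json


def encode_task_args(kwargs):
    """Serialize task arguments to JSON, turning booleans into strings."""
    encoded = {}
    for name, value in kwargs.items():
        if isinstance(value, bool):
            # The API expects 'True'/'true' or 'False'/'false' as strings.
            encoded[name] = "True" if value else "False"
        else:
            encoded[name] = value
    return json.dumps(encoded)
```

With the tesseract options from the command-line example, encode_task_args({"languages": ["eng"], "extended": True}) produces a body in which extended is the string "True".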
GET /api/v1/batch/(batch_id)/tasks/(group)/(task)
GET /api/v1/batch/(batch_id)/tasks/(group)
GET /api/v1/batch/(batch_id)/tasks
Retrieves the list of tasks and their argument values associated with a batch, optionally limited to a specific group.
** Request **
GET /batch/:batch_id/tasks
** Response **
HTTP/1.1 200 OK

{
    "segmentation": [
        ["tesseract", {}]
    ],
    "ocr": [
        ["kraken", {"model": "teubner"}]
    ]
}
To limit output to a specific group of tasks, e.g. segmentation or binarization, append the group to the URL:
** Request **
GET /batch/:batch_id/tasks/:group
** Response **
HTTP/1.1 200 OK

{
    "group": [
        ["tesseract", {}],
        ["kraken", {}]
    ]
}
Status Codes: - 200 OK – success
- 404 Not Found – batch, group, or task not found.
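The task listing maps each group to a list of [task, arguments] pairs, which is easy to flatten for display or comparison. A minimal sketch, with flatten_tasks as a hypothetical helper name:

```python
import json


def flatten_tasks(body):
    """Turn a batch task listing into a list of (group, task, arguments)."""
    doc = json.loads(body)
    return [
        (group, name, args)
        for group, entries in doc.items()
        for name, args in entries
    ]
```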
POST /api/v1/batch/(batch_id)/pages
Adds a page (really any type of file) to the batch identified by batch_id.
** Request **
POST /batch/:batch/pages
** Response **
HTTP/1.1 201 CREATED

[
    {
        "name": "0033.tif",
        "url": "/pages/63ca3ec7-2592-4c7d-9009-913aac42535d/0033.tif"
    }
]
Form Parameters: - scans – file(s) to add to the batch
Status Codes: - 201 Created – page created
- 403 Forbidden – file couldn’t be created
- 404 Not Found – batch not found
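Files are uploaded through the scans form parameter, which suggests a multipart/form-data request. The sketch below hand-builds such a body with the standard library only; encode_scans is a hypothetical helper, and the multipart encoding and per-file content type are assumptions rather than something this reference states.

```python
import uuid


def encode_scans(files, boundary=None):
    """Encode (filename, bytes) pairs as a multipart body under field 'scans'.

    Returns (content_type_header_value, body_bytes). Filenames containing
    double quotes or line breaks are not escaped in this sketch.
    """
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for filename, data in files:
        header = (
            "--{0}\r\n"
            'Content-Disposition: form-data; name="scans"; filename="{1}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).format(boundary, filename)
        chunks.append(header.encode("utf-8"))
        chunks.append(data)
        chunks.append(b"\r\n")
    # The closing delimiter ends the multipart body.
    chunks.append("--{0}--\r\n".format(boundary).encode("utf-8"))
    return "multipart/form-data; boundary=" + boundary, b"".join(chunks)
```

The returned content type would go into the request's Content-Type header and the bytes into its body.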
GET /api/v1/batch/(batch_id)/pages
Returns the list of pages associated with the batch with batch_id.
** Request **
GET /batch/:batch/pages
** Response **
HTTP/1.1 200 OK

[
    {
        "name": "0033.tif",
        "url": "/pages/63ca3ec7-2592-4c7d-9009-913aac42535d/0033.tif"
    },
    {
        "name": "0072.tif",
        "url": "/pages/63ca3ec7-2592-4c7d-9009-913aac42535d/0072.tif"
    },
    {
        "name": "0014.tif",
        "url": "/pages/63ca3ec7-2592-4c7d-9009-913aac42535d/0014.tif"
    }
]
Status Codes: - 200 OK – success
- 404 Not Found – batch not found
GET /api/v1/pages/(batch)/(path: file)
Retrieves the file at the path file in the batch identified by batch.
** Request **
GET /pages/:batch/:path
** Response **
HTTP/1.1 200 OK
Content-Type: application/octet-stream

...
Parameters: - batch (str) – batch’s unique id
- file (path) – path to the batch’s file
Status Codes: - 200 OK – No error
- 404 Not Found – File not found
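The url values in page listings look like paths below the API root, since files are served from /api/v1/pages/(batch)/(path). Under that assumption, a client could map file names to download URLs as sketched here; absolute_page_url and index_pages are hypothetical helpers and the root URL in the usage note is only an example.

```python
import json


def absolute_page_url(api_root, page_url):
    """Join the API root and a page url, assumed relative to that root."""
    return api_root.rstrip("/") + "/" + page_url.lstrip("/")


def index_pages(body, api_root):
    """Map each page name in a page listing to its absolute download URL."""
    return {
        page["name"]: absolute_page_url(api_root, page["url"])
        for page in json.loads(body)
    }
```

For example, index_pages(listing, "http://127.0.0.1:8080/api/v1") would give one URL per file name in the listing.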
POST /api/v1/batch/(batch_id)
Executes the batch with identifier batch_id.
** Request **
POST /batch/:batch_id
** Response **
HTTP/1.1 202 ACCEPTED
Parameters: - batch_id (string) – batch’s unique id
Status Codes: - 202 Accepted – Successfully executed
- 400 Bad Request – Batch could not be executed
- 404 Not Found – No such batch
- 409 Conflict – Trying to reexecute an already executed batch
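The four status codes above can be turned into a simple client-side check. This is a minimal sketch; BatchExecutionError and check_execute_status are hypothetical names, and the messages paraphrase the status code descriptions listed here.

```python
class BatchExecutionError(Exception):
    """Raised when the server did not accept the batch for execution."""


# Messages paraphrase the documented status codes for this endpoint.
_EXECUTE_ERRORS = {
    400: "batch could not be executed",
    404: "no such batch",
    409: "batch was already executed",
}


def check_execute_status(status):
    """Return True for 202 Accepted, raise BatchExecutionError otherwise."""
    if status == 202:
        return True
    reason = _EXECUTE_ERRORS.get(status, "unexpected status")
    raise BatchExecutionError("{0}: {1}".format(status, reason))
```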
GET /api/v1/batch/(batch_id)
Retrieves the state of batch batch_id.
** Request **
GET /batch/:batch_id
** Response **
HTTP/1.1 200 OK
Parameters: - batch_id (string) – batch identifier
Status Codes: - 200 OK – No error
- 404 Not Found – No such batch