Advanced Job Usage
Prevent Output Buffering
By default, stdout is buffered. To change that behaviour you can use stdbuf.
eai job submit -- stdbuf -oL '{COMMAND}'
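One detail that is easy to miss: stdbuf expects the program and its arguments as separate words, so a command with arguments generally cannot be passed as one quoted string. The sketch below illustrates this locally, without calling eai. It assumes GNU coreutils stdbuf is installed; the echo and tr commands are placeholders for your own command.

```shell
# Local sketch, runs without eai; assumes GNU coreutils `stdbuf` is on PATH.

# Program and arguments as separate words: works.
ok=$(stdbuf -oL echo hello)

# Whole command as a single quoted word: stdbuf looks for a program
# literally named "echo hello" and fails.
if stdbuf -oL 'echo hello' >/dev/null 2>&1; then
  single_word=works
else
  single_word=fails
fi

# A pipeline needs a shell, so wrap it in bash -c and apply stdbuf
# to the stage whose output should be line-buffered.
piped=$(bash -c 'stdbuf -oL echo hello | tr a-z A-Z')

echo "$ok / $single_word / $piped"
```

stdbuf only affects programs that rely on the C library's default buffering. Programs that manage their own buffering are not affected; Python, for example, is usually run with python -u or PYTHONUNBUFFERED=1 instead.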
Redirect stdout and stderr to a File
The following command redirects both stdout and stderr to a single file named after the job ID.
eai job submit -- bash -c '{COMMAND} > /home/toolkit/${EAI_JOB_ID}.txt 2>&1'
If you want to keep output from all the runs of a preemptable job, you might want to append to the same log file …
eai job submit -- bash -c '{COMMAND} >> /home/toolkit/${EAI_JOB_ID}.txt 2>&1'
… or use the EAI_RUN_INDEX environment variable to have one log file per run.
eai job submit -- bash -c '{COMMAND} > /home/toolkit/${EAI_JOB_ID}.${EAI_RUN_INDEX}.txt 2>&1'
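The single quotes in these commands matter: they stop the local shell from expanding ${EAI_JOB_ID} and ${EAI_RUN_INDEX}, so the variables are resolved inside the job. The sketch below simulates two runs locally to show what the append variant and the per-run variant leave on disk. The job ID value, the LOG_DIR variable, and the temporary directory standing in for /home/toolkit are made up for illustration.

```shell
# Local simulation, no eai involved. The EAI_JOB_ID / EAI_RUN_INDEX values are
# made up; LOG_DIR is a hypothetical stand-in for /home/toolkit.
LOG_DIR=$(mktemp -d)
export LOG_DIR
export EAI_JOB_ID="job-1234"

for EAI_RUN_INDEX in 0 1; do
  export EAI_RUN_INDEX
  # Variant 1: append, so one file accumulates the output of every run.
  bash -c 'echo "output of run ${EAI_RUN_INDEX}" >> "${LOG_DIR}/${EAI_JOB_ID}.txt" 2>&1'
  # Variant 2: truncate, with one file per run.
  bash -c 'echo "output of run ${EAI_RUN_INDEX}" > "${LOG_DIR}/${EAI_JOB_ID}.${EAI_RUN_INDEX}.txt" 2>&1'
done

ls "$LOG_DIR"
```

After both simulated runs, the appended file holds two lines, while each per-run file holds only the output of its own run.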
Run a “shell” on a job
# launching a job that will stick around
eai job submit -- bash -c 'while true; do sleep 60; done;'
# start an interactive pseudo terminal into it
eai job exec --last -- bash
Monitor a Job's Resource Usage
Resource usage can be viewed in the Webapp by clicking on a job and then on the Resource usage link.
The links to the dashboards are also available in the CLI, by calling:
$ eai job info $JOB_ID
command:
- sleep
- "300"
...
runs:
- id: 5697ab0b-30de-439d-a833-1e70ab2d8d40
  jobId: ae91489a-efb3-4bf0-8ff1-e780d4ff68ac
  ...
  resourceUsageUrl: https://graphs.console.elementai.com/to/resource_url/ae91489a-efb3-4bf0-8ff1-e780d4ff68ac/runs/5697ab0b-30de-439d-a833-1e70ab2d8d40
...
resourceUsageUrl: https://graphs.console.elementai.com/to/resource_url/ae91489a-efb3-4bf0-8ff1-e780d4ff68ac?full
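To pull just the dashboard links out of that output, a plain-text filter is often enough. The following is a minimal sketch, not a YAML parser: it assumes each resourceUsageUrl key and its value sit on one line, as in the sample output. The sample text is a trimmed copy of the output shown here, and its indentation is an assumption.

```shell
# Trimmed sample of the output shown in this section. In practice the text
# would come from: eai job info "$JOB_ID"
job_info=$(cat <<'EOF'
command:
- sleep
- "300"
runs:
- id: 5697ab0b-30de-439d-a833-1e70ab2d8d40
  jobId: ae91489a-efb3-4bf0-8ff1-e780d4ff68ac
  resourceUsageUrl: https://graphs.console.elementai.com/to/resource_url/ae91489a-efb3-4bf0-8ff1-e780d4ff68ac/runs/5697ab0b-30de-439d-a833-1e70ab2d8d40
resourceUsageUrl: https://graphs.console.elementai.com/to/resource_url/ae91489a-efb3-4bf0-8ff1-e780d4ff68ac?full
EOF
)

# Print the value of every resourceUsageUrl key, whatever its indentation.
urls=$(printf '%s\n' "$job_info" | awk '$1 == "resourceUsageUrl:" { print $2 }')
printf '%s\n' "$urls"
```

With the sample above, the filter prints the per-run link first and the job-level link (ending in ?full) last.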
Warning
The only way for users to view job resource usage is through Grafana. The job_usage API endpoints are poorly named and yield resource occupation rather than usage.
Port-forward to a job
To port-forward to a job, the job must be started with the --tunnel flag.
# launching a job that will stick around
eai job submit --tunnel -i python:3 -- python3 -m http.server 8080
# start a port-forward to the job
eai job port-forward --last 8080
# access the job
curl http://localhost:8080
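The server inside the job may need a moment before it answers, so a first curl can fail even though the tunnel is fine. A small retry loop is one way to handle that. The helper below is a hypothetical sketch, not part of the eai CLI; its name, the default attempt count, and the default delay are assumptions.

```shell
# wait_for_http is a hypothetical helper, not part of the eai CLI.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  url=$1
  attempts=${2:-10}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors (4xx/5xx).
    if curl --silent --fail --output /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Possible use once the port-forward is running in another terminal:
#   wait_for_http http://localhost:8080 && curl http://localhost:8080
```

The function returns 0 as soon as the URL answers successfully, and 1 if every attempt fails.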
Access a gRPC server in a job/service
Toolkit supports jobs running a gRPC server.
On internal Toolkit clusters such as Superpod, Toolkit detects gRPC requests directly and uses the HTTP2/H2C protocol.
For the Toolkit cluster at https://console.elementai.com, you need to use another domain name dedicated to gRPC:
https://jobID.job-grpc.elementai.com
https://jobID-portNumber.job-grpc.elementai.com
https://serviceID.service-grpc.elementai.com
instead of:
https://jobID.job.elementai.com
https://jobID-portNumber.job.elementai.com
https://serviceID.service.elementai.com
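The mapping between the two sets of names is mechanical: the job or service label gains a -grpc suffix. As a minimal sketch, the rewrite can be scripted; to_grpc_url is a hypothetical helper name, and the pattern only covers the elementai.com host names listed here.

```shell
# to_grpc_url is a hypothetical helper: it rewrites a regular job/service URL
# into its gRPC counterpart, following the naming pattern listed in this section.
to_grpc_url() {
  printf '%s\n' "$1" | sed -E 's#\.(job|service)\.elementai\.com#.\1-grpc.elementai.com#'
}

to_grpc_url "https://jobID.job.elementai.com"
to_grpc_url "https://jobID-portNumber.job.elementai.com"
to_grpc_url "https://serviceID.service.elementai.com"
```

URLs that do not match the pattern pass through unchanged.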
Note
For debugging purposes, you can set the header Request-Type: h2c to force the HTTP2/H2C protocol.