FAQ

Can I delete an account, a data or a job?

No, not yet.

The motivation today is to give users some level of traceability. Keeping jobs and the data they used is useful in that regard. Of course, it is sometimes clear that some data and jobs are temporary, or the result of manipulation errors, and have no side effects. We just have to put safeguards in place so that deleting something does not unintentionally break traceability.

Note that it’s perfectly fine to have a “workspace” data that gets reused between jobs. This way, its content can be wiped when you’re done.

Is there a way to list your images in the private registry?

Not yet.

Why doesn’t the default sandbox account have admin and user roles?

The sandbox account is meant to be a place where you can experiment without having to be concerned about security. While you can manually create roles in this account, we don’t create them by default. If you do create them, it won’t be by accident, so we assume that you know the implications. What we suggest instead is that you share using sub accounts. See How To Share for details.

Does data pull work like rsync and only pull the changes, or does it always pull everything?

It currently pulls everything. It’s on the roadmap to do this incrementally.

In the meantime, if you still need to use rsync:

eai job new --image registry.console.elementai.com/shared.image/sshd --data your-data:/data:ro --restartable
eai job port-forward --last 2222

Then, in another terminal, use rsync:

rsync -avzh -e 'ssh -p 2222' _toolchain@localhost:/data/ /your/folder

Be sure to run this in a safe environment, i.e. NOT from a shared workstation. Anyone who has access to port 2222 on that machine can connect to it, because there is no authentication on the SSH tunnel.

Can I set the profile for a session or a script so that I don’t have to specify --profile on every command?

Yes. To do so, set the EAI_PROFILE environment variable:

$ export EAI_PROFILE=dev
$ eai job new -- echo 'Hello World'
$ eai job info --last

The two commands will use the dev profile.
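
For a script, a minimal sketch of two standard shell patterns may help: falling back to a default profile only when the caller has not already set one, and limiting an override to a subshell. Only the EAI_PROFILE variable and the dev profile come from this FAQ. The staging profile name is hypothetical, and echo stands in for eai commands so that the sketch runs without the CLI installed.

```shell
# Keep the caller's EAI_PROFILE if it is already set; otherwise fall back to "dev".
: "${EAI_PROFILE:=dev}"
export EAI_PROFILE

# A subshell limits an override to the commands between the parentheses.
(
  # "staging" is a hypothetical profile name used only for illustration.
  EAI_PROFILE=staging
  # Any eai command placed here would see EAI_PROFILE=staging.
  echo "inside subshell: $EAI_PROFILE"
)

# Back in the parent shell, the original value is untouched.
echo "after subshell: $EAI_PROFILE"
```

If EAI_PROFILE was not set beforehand, this prints staging inside the subshell and dev after it.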

Why do I get disconnected from my eai job exec or my eai job log -f?

This is a problem on our side. A ticket is open so we’ll address it when we get a chance.

In the meantime, using the EAI VPN avoids the problem.

Can I change the ownership of a data? Can I move a data to another account?

Since CLI release 0.9.66, a new command is available: eai data chown ...

To move a data to another account, you need the following policies:

  • Write access to the data (data:set): for the duration of the moving process, the data needs to be locked to avoid being modified by any users or jobs during the move

  • Write access to the source account (account:set): moving a data implies modifying an account

  • Write access to the destination account (account:set): moving a data implies modifying an account

See the “Can I set the profile for a session or a script so that I don’t have to specify --profile on every command?” section or the How to Share section.

Once all your permissions are set, run the following command:

$ eai data chown DATA_ID DESTINATION_ACCOUNT

How often are the backups made?

Backups are made every Friday. Contact IT if you need a backup to be restored.

How does scheduling work?

There are two key concepts to Toolkit scheduling: resource usage and accounts.

At a high level, the scheduler strategy is per-account, resource-based fair scheduling: scheduling priority is inversely related to resource usage. The fewer resources an account is using, the higher its priority. An account not running a single job has top priority.

On top of that a few things are taken into consideration:

  1. Each user, not each account, may launch a single interactive job, which is given top priority. This bypasses regular scheduling.

  2. Within an account, jobs are ordered by two criteria:

       1. Highest bid goes first.

       2. FIFO (First In, First Out).

Warning

Since scheduling is account-based, many users launching jobs within the same account still carry the weight of a single account, which often makes users feel that their jobs are not fairly scheduled. There is currently no Toolkit configuration to deal with that situation.
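
The ordering rules described in this answer can be illustrated with a small sketch. This is not the scheduler's implementation: it assumes resource usage can be reduced to a single number per account and that bids are numeric with higher values winning. All account names, job names, and values are hypothetical, and the interactive-job rule is not modeled.

```shell
# Hypothetical snapshot: account name, resources currently in use (arbitrary units).
accounts='team-a 12
team-b 0
team-c 5'

# The fewer resources an account uses, the higher its priority:
# sort ascending on the usage column.
account_order=$(printf '%s\n' "$accounts" | sort -k2,2n | awk '{print $1}')

# Hypothetical queue for one account: submission sequence, bid, job name.
queue='1 0 job-x
2 5 job-y
3 5 job-z
4 1 job-w'

# Highest bid first, then FIFO (lowest submission sequence) among equal bids.
job_order=$(printf '%s\n' "$queue" | sort -k2,2nr -k1,1n | awk '{print $3}')

# Unquoted expansion joins the sorted lines with spaces for display.
echo "account priority: $(echo $account_order)"
echo "job order: $(echo $job_order)"
```

With these sample values, the account order is team-b, team-c, team-a, and the job order is job-y, job-z, job-w, job-x.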

Why a Fair Share Scheduling strategy?

This strategy was chosen for two of its benefits.

  1. If an account with no jobs comes in while other accounts are using the cluster at 100% with hundreds of jobs in the queue, that new account does not have to wait behind that queue for its own jobs to run.

  2. It’s an incentive to request the right amount of resources. If an account’s jobs are taking more resources than they should, whenever the cluster is full, this account has fewer jobs running in parallel than it would if they were configured to use fewer resources.
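
The second benefit comes down to simple arithmetic, sketched here with made-up numbers. The sketch assumes that, when the cluster is full, an account ends up holding a roughly fixed slice of resources. The slice size and both request sizes are hypothetical.

```shell
# Hypothetical slice of resources an account holds when the cluster is full.
account_slice=16

# Two hypothetical per-job requests for the same workload.
oversized_request=8
right_sized_request=2

# Number of jobs that fit in the slice at the same time.
oversized_parallel=$((account_slice / oversized_request))
right_sized_parallel=$((account_slice / right_sized_request))

echo "oversized requests: $oversized_parallel jobs in parallel"
echo "right-sized requests: $right_sized_parallel jobs in parallel"
```

Under these assumptions, the oversized requests allow 2 jobs in parallel while the right-sized ones allow 8.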