AI Driving Olympics

Challenge "[Test] Multi-step evaluation"

Challenge description

A test challenge for multi-step evaluation.

Scoring

Scoring criteria

These are the metrics defined:

passed-step1

1 if the submission passed the first step.

passed-step2

1 if the submission passed the second step.
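As a rough illustration of how these two metrics could be derived from per-step results, here is a minimal Python sketch. It rests on two assumptions the page does not state: that a step counts as "passed" exactly when its status is success, and that the metric is 0 otherwise (including when the step never ran). The function name `compute_metrics` and the shape of its input are hypothetical.

```python
from typing import Dict


def compute_metrics(step_status: Dict[str, str]) -> Dict[str, int]:
    """Map per-step statuses to the two metrics listed above.

    step_status is a hypothetical dict such as {"step1": "success"}.
    Assumption: a metric is 0 when the step failed, errored, or never ran.
    """
    return {
        "passed-step1": 1 if step_status.get("step1") == "success" else 0,
        "passed-step2": 1 if step_status.get("step2") == "success" else 0,
    }
```

For example, a submission that passes step1 and fails step2 would score 1 on passed-step1 and 0 on passed-step2 under these assumptions.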

Details

Technical details

Evaluation steps details

  • At the beginning execute step step1.

  • If step step1 finishes with status success, then execute step step2.

  • If step step1 finishes with status failed, then declare the submission FAILED.

  • If step step1 finishes with status error, then declare the submission ERROR.

  • If step step2 finishes with status success, then declare the submission SUCCESS.

  • If step step2 finishes with status failed, then declare the submission FAILED.

  • If step step2 finishes with status error, then declare the submission ERROR.
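The rules above amount to a small state machine. The following Python sketch transcribes them into a transition table, purely as an illustration; it is not the evaluator's actual implementation. The callback `run_step` is a hypothetical stand-in for whatever executes a step and reports its status.

```python
from typing import Callable, Dict, Tuple

# Transition table transcribed from the rules above.
# Keys are (step, status); values are the next step or a final verdict.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("step1", "success"): "step2",
    ("step1", "failed"): "FAILED",
    ("step1", "error"): "ERROR",
    ("step2", "success"): "SUCCESS",
    ("step2", "failed"): "FAILED",
    ("step2", "error"): "ERROR",
}

FINAL_VERDICTS = {"SUCCESS", "FAILED", "ERROR"}


def evaluate(run_step: Callable[[str], str]) -> str:
    """Start at step1 and follow the table until a final verdict is reached.

    run_step is a hypothetical callback: it executes the named step and
    returns one of "success", "failed", or "error".
    """
    current = "step1"
    while current not in FINAL_VERDICTS:
        status = run_step(current)
        current = TRANSITIONS[(current, status)]
    return current
```

With this table, step2 is only reached when step1 succeeds, so a failure or error in step1 ends the evaluation immediately.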

Evaluation step step1

Timeout 100.0

In this first step, we run the container step1-evaluator and the solution container.

This is the Docker Compose configuration skeleton:

version: '3'
services:
    solution:
        image: SUBMISSION_CONTAINER
        environment: {}
    step1-evaluator:
        image: docker.io/andreacensi/aido2-ms-regular-step1-step1-evaluator:2019_05_13_15_59_33@sha256:74faab735ba4ddad5d3672d24fee8dd444384036f8a3e65f409cd8910a874099
        environment: {}

The text SUBMISSION_CONTAINER will be replaced with the user's container image.
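To make the placeholder substitution concrete, here is a minimal Python sketch that fills in a skeleton by plain string replacement. Whether the real evaluator uses string replacement or edits the parsed YAML is not stated here, so treat this as an assumption. The function name `fill_skeleton` and the example image name are hypothetical, and the evaluator service is omitted from the skeleton for brevity.

```python
# Shortened copy of the skeleton above; the step1-evaluator service is
# left out to keep the example small.
SKELETON = """\
version: '3'
services:
    solution:
        image: SUBMISSION_CONTAINER
        environment: {}
"""


def fill_skeleton(skeleton: str, submission_image: str) -> str:
    """Return the skeleton with the placeholder replaced by the given image.

    Assumption: plain text replacement of the literal SUBMISSION_CONTAINER.
    """
    return skeleton.replace("SUBMISSION_CONTAINER", submission_image)


# Hypothetical image name, used only for illustration.
filled = fill_skeleton(SKELETON, "registry.example.com/user/solution:latest")
print(filled)
```

The resulting text is an ordinary Docker Compose file in which the solution service points at the submitted image.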

Resources required for evaluating this step

No particular resources required.

Evaluation step step2

Timeout 100.0

In the second step, we only run a second container called step2-evaluator.

This is the Docker Compose configuration skeleton:

version: '3'
services:
    step2-evaluator:
        image: docker.io/andreacensi/aido2-ms-regular-step2-step2-evaluator:2019_05_13_16_00_07@sha256:74faab735ba4ddad5d3672d24fee8dd444384036f8a3e65f409cd8910a874099
        environment: {}

This configuration contains no SUBMISSION_CONTAINER placeholder, so the user's container is not run in this step.

Resources required for evaluating this step

No particular resources required.