Job 60545

Job ID: 60545
submission: 13284
user: András Kalapos 🇭🇺
user label: real-v1.0-3092-363
challenge: aido5-LF-real-validation
step: eval0
status: aborted
up to date: yes
evaluator: 33
date started:
date completed:
duration: 0:06:15
message:
Operator message: ''
Logs:
DEBUG:commons:version: 6.1.7 *
INFO:typing:version: 6.1.8
DEBUG:aido_schemas:aido-protocols version 6.0.33 path /usr/local/lib/python3.8/dist-packages
INFO:nodes:version 6.1.1 path /usr/local/lib/python3.8/dist-packages pyparsing 2.4.6
2020-12-11 21:10:57.484773: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-12-11 21:11:00.240024: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:
2020-12-11 21:11:00.240064: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-12-11 21:11:00.240112: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
DEBUG:ipce:version 6.0.36 path /usr/local/lib/python3.8/dist-packages
INFO:nodes_wrapper:checking implementation
INFO:nodes_wrapper:checking implementation OK
DEBUG:nodes_wrapper:run_loop
  fin: /fifos/ego0-in
 fout: fifo:/fifos/ego0-out
INFO:nodes_wrapper:Fifo /fifos/ego0-out created. I will block until a reader appears.
INFO:nodes_wrapper:Fifo reader appeared for /fifos/ego0-out.
INFO:nodes_wrapper:Node RLlibAgent starting reading
 fi_desc: /fifos/ego0-in
 fo_desc: fifo:/fifos/ego0-out
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: init()
WARNING:config.config:Found paths with seed 3092:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Found checkpoints in ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47:
WARNING:config.config:0: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Config loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/config_dump_3092.yml
WARNING:config.config:Model checkpoint loaded from ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
WARNING:config.config:Updating default config values by: 
 env_config:
  mode: inference

WARNING:config.config:Env_config.mode is 'inference', some hyperparameters will be overwritten by: 
 rllib_config:
  num_workers: 0
  num_gpus: 0
  callbacks: {}
ray_init_config:
  num_cpus: 1
  memory: 2097152000
  object_store_memory: 209715200
  redis_max_memory: 209715200
  local_mode: true

INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: === Wrappers ===================================
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Observation wrappers
 <ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>
<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>
<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>
<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Action wrappers
 <Heading2WheelVelsWrapper<NormalizeWrapper<ObservationBufferWrapper<ResizeWrapper<ClipImageWrapper<DummyDuckietownGymLikeEnv instance>>>>>>
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Reward wrappers
 
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: === Config ===================================
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: seed: 3092
experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
algo: PPO
algo_config_files:
  PPO: config/algo/ppo.yml
  general: config/algo/general.yml
env_config:
  mode: inference
  episode_max_steps: 500
  resized_input_shape: (84, 84)
  crop_image_top: true
  top_crop_divider: 3
  grayscale_image: false
  frame_stacking: true
  frame_stacking_depth: 3
  motion_blur: false
  action_type: heading
  reward_function: posangle
  distortion: true
  accepted_start_angle_deg: 30
  simulation_framerate: 30
  frame_skip: 3
  action_delay_ratio: 0.0
  training_map: multimap_aido5
  domain_rand: true
  dynamics_rand: true
  camera_rand: true
  frame_repeating: 0.0
  spawn_obstacles: false
  obstacles:
    duckie:
      density: 0.5
      static: true
    duckiebot:
      density: 0
      static: false
  spawn_forward_obstacle: false
  aido_wrapper: true
  wandb:
    project: duckietown-rllib
  experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
  seed: 3092
ray_init_config:
  num_cpus: 1
  webui_host: 127.0.0.1
  memory: 2097152000
  object_store_memory: 209715200
  redis_max_memory: 209715200
  local_mode: true
restore_seed: 3091
restore_experiment_idx: 0
restore_checkpoint_idx: 0
debug_hparams:
  rllib_config:
    num_workers: 1
    num_gpus: 0
  ray_init_config:
    num_cpus: 1
    memory: 2097152000
    object_store_memory: 209715200
    redis_max_memory: 209715200
    local_mode: true
inference_hparams:
  rllib_config:
    num_workers: 0
    num_gpus: 0
    callbacks: {}
  ray_init_config:
    num_cpus: 1
    memory: 2097152000
    object_store_memory: 209715200
    redis_max_memory: 209715200
    local_mode: true
timesteps_total: 4000000.0
rllib_config:
  num_workers: 0
  sample_batch_size: 265
  num_gpus: 0
  train_batch_size: 4096
  gamma: 0.99
  lr: 5.0e-05
  monitor: false
  evaluation_interval: 25
  evaluation_num_episodes: 2
  evaluation_config:
    monitor: false
    explore: false
  seed: 1234
  lambda: 0.95
  sgd_minibatch_size: 128
  vf_loss_coeff: 0.5
  entropy_coeff: 0.0
  clip_param: 0.2
  vf_clip_param: 0.2
  grad_clip: 0.5
  env: Duckietown
  callbacks: {}
  env_config:
    mode: inference
    episode_max_steps: 500
    resized_input_shape: (84, 84)
    crop_image_top: true
    top_crop_divider: 3
    grayscale_image: false
    frame_stacking: true
    frame_stacking_depth: 3
    motion_blur: false
    action_type: heading
    reward_function: posangle
    distortion: true
    accepted_start_angle_deg: 30
    simulation_framerate: 30
    frame_skip: 3
    action_delay_ratio: 0.0
    training_map: multimap_aido5
    domain_rand: true
    dynamics_rand: true
    camera_rand: true
    frame_repeating: 0.0
    spawn_obstacles: false
    obstacles:
      duckie:
        density: 0.5
        static: true
      duckiebot:
        density: 0
        static: false
    spawn_forward_obstacle: false
    aido_wrapper: true
    wandb:
      project: duckietown-rllib
    experiment_name: PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand
    seed: 3092

2020-12-11 21:11:03,758	INFO trainer.py:428 -- Tip: set 'eager': true or the --eager flag to enable TensorFlow eager execution
2020-12-11 21:11:03,777	ERROR syncer.py:39 -- Log sync requires rsync to be installed.
2020-12-11 21:11:03,778	WARNING deprecation.py:29 -- DeprecationWarning: `sample_batch_size` has been deprecated. Use `rollout_fragment_length` instead. This will raise an error in the future!
2020-12-11 21:11:03,778	INFO trainer.py:583 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
2020-12-11 21:11:03.793170: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-12-11 21:11:03.800627: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2599935000 Hz
2020-12-11 21:11:03.801289: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x90734d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-12-11 21:11:03.801341: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-12-11 21:11:08,963	INFO trainable.py:217 -- Getting current IP.
2020-12-11 21:11:08,964	WARNING util.py:37 -- Install gputil for GPU system monitoring.
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Restoring checkpoint from: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 21:11:09,040	INFO trainable.py:217 -- Getting current IP.
2020-12-11 21:11:09,040	INFO trainable.py:422 -- Restored on 172.17.0.2 from checkpoint: ./models/PPO-RLlib-AIDO5_FrameSkip3_NewMaps_StartAngle30_AIDOWrapper_DomainRand_3092/Dec10_00-31-47/PPO_0_2020-12-10_00-31-48u8cipgyq/checkpoint_363/checkpoint-363
2020-12-11 21:11:09,040	INFO trainable.py:430 -- Current state after restoring: {'_iteration': 363, '_timesteps_total': 1539120, '_time_total': 110224.9016327858, '_episodes_total': 8614}
INFO:nodes_wrapper:5f97768fd25d:RLlibAgent: Starting episode "episode".
ERROR:nodes_wrapper:Error in node RLlibAgent: 
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
    handle_message_node(parsed, receiver0, context0)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
    call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
    f(**kwargs)
  File "solution.py", line 61, in on_received_observations
    new_image = jpg2rgb(camera.jpg_data)
  File "solution.py", line 113, in jpg2rgb
    im = im.convert('RGB')
  File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 902, in convert
    self.load()
  File "/usr/local/lib/python3.8/dist-packages/PIL/ImageFile.py", line 261, in load
    n, err_code = decoder.decode(b)
  File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
    sys.exit(signal.SIGTERM)
SystemExit: Signals.SIGTERM

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 243, in run_loop
    loop(node_name, fi, fo, node, protocol, tin, tout, config=config, fi_desc=fin, fo_desc=fout)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 378, in loop
    raise InternalProblem(msg) from e  # XXX
zuper_nodes.structures.InternalProblem: Exception while handling a message on topic "observations".

| Traceback (most recent call last):
|   File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
|     handle_message_node(parsed, receiver0, context0)
|   File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
|     call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
|   File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
|     f(**kwargs)
|   File "solution.py", line 61, in on_received_observations
|     new_image = jpg2rgb(camera.jpg_data)
|   File "solution.py", line 113, in jpg2rgb
|     im = im.convert('RGB')
|   File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 902, in convert
|     self.load()
|   File "/usr/local/lib/python3.8/dist-packages/PIL/ImageFile.py", line 261, in load
|     n, err_code = decoder.decode(b)
|   File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
|     sys.exit(signal.SIGTERM)
| SystemExit: Signals.SIGTERM
| 

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
    handle_message_node(parsed, receiver0, context0)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
    call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
    f(**kwargs)
  File "solution.py", line 61, in on_received_observations
    new_image = jpg2rgb(camera.jpg_data)
  File "solution.py", line 113, in jpg2rgb
    im = im.convert('RGB')
  File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 902, in convert
    self.load()
  File "/usr/local/lib/python3.8/dist-packages/PIL/ImageFile.py", line 261, in load
    n, err_code = decoder.decode(b)
  File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
    sys.exit(signal.SIGTERM)
SystemExit: Signals.SIGTERM

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 243, in run_loop
    loop(node_name, fi, fo, node, protocol, tin, tout, config=config, fi_desc=fin, fo_desc=fout)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 378, in loop
    raise InternalProblem(msg) from e  # XXX
zuper_nodes.structures.InternalProblem: Exception while handling a message on topic "observations".

| Traceback (most recent call last):
|   File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 355, in loop
|     handle_message_node(parsed, receiver0, context0)
|   File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 531, in handle_message_node
|     call_if_fun_exists(agent, expect_fn, data=ob, context=context, timing=timing)
|   File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/utils.py", line 21, in call_if_fun_exists
|     f(**kwargs)
|   File "solution.py", line 61, in on_received_observations
|     new_image = jpg2rgb(camera.jpg_data)
|   File "solution.py", line 113, in jpg2rgb
|     im = im.convert('RGB')
|   File "/usr/local/lib/python3.8/dist-packages/PIL/Image.py", line 902, in convert
|     self.load()
|   File "/usr/local/lib/python3.8/dist-packages/PIL/ImageFile.py", line 261, in load
|     n, err_code = decoder.decode(b)
|   File "/usr/local/lib/python3.8/dist-packages/ray/worker.py", line 881, in sigterm_handler
|     sys.exit(signal.SIGTERM)
| SystemExit: Signals.SIGTERM
| 

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "solution.py", line 127, in <module>
    main()
  File "solution.py", line 123, in main
    wrap_direct(node=node, protocol=protocol)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/interface.py", line 24, in wrap_direct
    run_loop(node, protocol, args)
  File "/usr/local/lib/python3.8/dist-packages/zuper_nodes_wrapper/wrapper.py", line 251, in run_loop
    raise Exception(msg) from e
Exception: Error in node RLlibAgent