Values at iteration 0 are correct.
Student/correct solution:
values_k_0: """
    __________     0.0000 __________
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
    __________     0.0000 __________
"""

Q-Values at iteration 0 for action south are correct.
Student/correct solution:
q_values_k_0_action_south: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""

Q-Values at iteration 0 for action west are correct.
Student/correct solution:
q_values_k_0_action_west: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""

Q-Values at iteration 0 for action exit are correct.
Student/correct solution:
q_values_k_0_action_exit: """
    __________    10.0000 __________
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
    __________     1.0000 __________
"""

Q-Values at iteration 0 for action east are correct.
Student/correct solution:
q_values_k_0_action_east: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""

Q-Values at iteration 0 for action north are correct.
Student/correct solution:
q_values_k_0_action_north: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""

Values at iteration 1 are NOT correct.
Student solution:
values_k_1: """
    __________     0.0000 __________
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
        0.0000     0.0000     0.0000
    __________     0.0000 __________
"""
Correct solution:
values_k_1: """
    __________    10.0000 __________
     -100.0000     0.0000  -100.0000
     -100.0000     0.0000  -100.0000
     -100.0000     0.0000  -100.0000
     -100.0000     0.0000  -100.0000
     -100.0000     0.0000  -100.0000
    __________     1.0000 __________
"""

Q-Values at iteration 1 for action south are NOT correct.
Student solution:
q_values_k_1_action_south: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""
Correct solution:
q_values_k_1_action_south: """
    __________    illegal __________
       illegal    -8.5000    illegal
       illegal    -8.5000    illegal
       illegal    -8.5000    illegal
       illegal    -8.5000    illegal
       illegal    -7.7350    illegal
    __________    illegal __________
"""

Q-Values at iteration 1 for action west are NOT correct.
Student solution:
q_values_k_1_action_west: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""
Correct solution:
q_values_k_1_action_west: """
    __________    illegal __________
       illegal   -76.0750    illegal
       illegal   -76.5000    illegal
       illegal   -76.5000    illegal
       illegal   -76.5000    illegal
       illegal   -76.4575    illegal
    __________    illegal __________
"""

Q-Values at iteration 1 for action exit are correct.
Student/correct solution:
q_values_k_1_action_exit: """
    __________    10.0000 __________
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
     -100.0000    illegal  -100.0000
    __________     1.0000 __________
"""

Q-Values at iteration 1 for action east are NOT correct.
Student solution:
q_values_k_1_action_east: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""
Correct solution:
q_values_k_1_action_east: """
    __________    illegal __________
       illegal   -76.0750    illegal
       illegal   -76.5000    illegal
       illegal   -76.5000    illegal
       illegal   -76.5000    illegal
       illegal   -76.4575    illegal
    __________    illegal __________
"""

Q-Values at iteration 1 for action north are NOT correct.
Student solution:
q_values_k_1_action_north: """
    __________    illegal __________
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
       illegal     0.0000    illegal
    __________    illegal __________
"""
Correct solution:
q_values_k_1_action_north: """
    __________    illegal __________
       illegal    -0.8500    illegal
       illegal    -8.5000    illegal
       illegal    -8.5000    illegal
       illegal    -8.5000    illegal
       illegal    -8.5000    illegal
    __________    illegal __________
"""
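
The expected iteration-1 Q-values in the output above can be checked by hand. The sketch below is not the grader's code. It assumes a discount of 0.85, a noise of 0.1 split evenly between the two perpendicular directions, a living reward of 0, and that a blocked move leaves the agent in place. None of these parameters appear in the log; they were picked because they reproduce its numbers. The names `V1`, `value_at`, and `q_value` are hypothetical.

```python
# Assumed parameters (not stated in the log; chosen because they match its numbers).
DISCOUNT = 0.85
NOISE = 0.1  # total slip probability, split evenly between the two sideways moves

# "Correct solution" values_k_1 from the log; None marks cells printed as __________.
V1 = [
    [None, 10.0, None],
    [-100.0, 0.0, -100.0],
    [-100.0, 0.0, -100.0],
    [-100.0, 0.0, -100.0],
    [-100.0, 0.0, -100.0],
    [-100.0, 0.0, -100.0],
    [None, 1.0, None],
]

# Row 0 is the top line of the printed grid, so "north" decreases the row index.
MOVES = {"north": (-1, 0), "south": (1, 0), "west": (0, -1), "east": (0, 1)}
SIDEWAYS = {
    "north": ("west", "east"),
    "south": ("west", "east"),
    "west": ("north", "south"),
    "east": ("north", "south"),
}


def value_at(values, row, col, direction):
    """Value of the cell reached by moving in `direction`; stay put if blocked."""
    d_row, d_col = MOVES[direction]
    r, c = row + d_row, col + d_col
    in_grid = 0 <= r < len(values) and 0 <= c < len(values[0])
    if in_grid and values[r][c] is not None:
        return values[r][c]
    return values[row][col]


def q_value(values, row, col, action):
    """One-step lookahead: discounted expected value of the successor cell."""
    intended = value_at(values, row, col, action)
    slips = sum(value_at(values, row, col, d) for d in SIDEWAYS[action])
    expected = (1 - NOISE) * intended + (NOISE / 2) * slips
    return DISCOUNT * expected  # living reward assumed to be 0


if __name__ == "__main__":
    for action in ("north", "south", "west", "east"):
        column = [q_value(V1, row, 1, action) for row in range(1, 6)]
        print(action, " ".join(f"{q:.4f}" for q in column))
```

Under these assumptions the middle column reproduces the expected grids, for example -0.8500 for north in the top row, -7.7350 for south in the bottom row, and -76.0750 for west and east in the top row. The student grids for iteration 1 are identical to those for iteration 0, which would be consistent with the value update never being applied, although the log alone does not establish the cause.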