Values at iteration 0 are correct.
Student/correct solution:
values_k_0: """
    0.0000     0.0000     0.0000     0.0000     0.0000
    0.0000     0.0000 __________     0.0000     0.0000
    0.0000     0.0000     0.0000     0.0000     0.0000
    0.0000     0.0000 __________ __________     0.0000
    0.0000     0.0000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 0 for action south are correct.
Student/correct solution:
q_values_k_0_action_south: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 0 for action west are correct.
Student/correct solution:
q_values_k_0_action_west: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 0 for action exit are correct.
Student/correct solution:
q_values_k_0_action_exit: """
  -10.0000    illegal    10.0000    illegal    illegal
  -10.0000    illegal __________    illegal    illegal
  -10.0000    illegal     1.0000    illegal    illegal
  -10.0000    illegal __________ __________    illegal
  -10.0000    illegal    illegal    illegal    illegal
"""

Q-Values at iteration 0 for action east are correct.
Student/correct solution:
q_values_k_0_action_east: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 0 for action north are correct.
Student/correct solution:
q_values_k_0_action_north: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""

Values at iteration 1 are NOT correct.
Student solution:
values_k_1: """
    0.0000     0.0000     0.0000     0.0000     0.0000
    0.0000     0.0000 __________     0.0000     0.0000
    0.0000     0.0000     0.0000     0.0000     0.0000
    0.0000     0.0000 __________ __________     0.0000
    0.0000     0.0000     0.0000     0.0000     0.0000
"""
Correct solution:
values_k_1: """
  -10.0000     0.0000    10.0000     0.0000     0.0000
  -10.0000     0.0000 __________     0.0000     0.0000
  -10.0000     0.0000     1.0000     0.0000     0.0000
  -10.0000     0.0000 __________ __________     0.0000
  -10.0000     0.0000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 1 for action south are NOT correct.
Student solution:
q_values_k_1_action_south: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""
Correct solution:
q_values_k_1_action_south: """
   illegal     0.0000    illegal     0.9000     0.0000
   illegal    -0.9000 __________     0.0000     0.0000
   illegal    -0.8100    illegal     0.0900     0.0000
   illegal    -0.9000 __________ __________     0.0000
   illegal    -0.9000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 1 for action west are NOT correct.
Student solution:
q_values_k_1_action_west: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""
Correct solution:
q_values_k_1_action_west: """
   illegal    -7.2000    illegal     7.2000     0.0000
   illegal    -7.2000 __________     0.0000     0.0000
   illegal    -7.2000    illegal     0.7200     0.0000
   illegal    -7.2000 __________ __________     0.0000
   illegal    -7.2000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 1 for action exit are correct.
Student/correct solution:
q_values_k_1_action_exit: """
  -10.0000    illegal    10.0000    illegal    illegal
  -10.0000    illegal __________    illegal    illegal
  -10.0000    illegal     1.0000    illegal    illegal
  -10.0000    illegal __________ __________    illegal
  -10.0000    illegal    illegal    illegal    illegal
"""

Q-Values at iteration 1 for action east are NOT correct.
Student solution:
q_values_k_1_action_east: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""
Correct solution:
q_values_k_1_action_east: """
   illegal     7.2000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.7200    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""

Q-Values at iteration 1 for action north are NOT correct.
Student solution:
q_values_k_1_action_north: """
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________     0.0000     0.0000
   illegal     0.0000    illegal     0.0000     0.0000
   illegal     0.0000 __________ __________     0.0000
   illegal     0.0000     0.0000     0.0000     0.0000
"""
Correct solution:
q_values_k_1_action_north: """
   illegal     0.0000    illegal     0.9000     0.0000
   illegal    -0.9000 __________     0.0000     0.0000
   illegal    -0.8100    illegal     0.0900     0.0000
   illegal    -0.9000 __________ __________     0.0000
   illegal    -0.9000     0.0000     0.0000     0.0000
"""
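
Note on the expected iteration-1 numbers: the correct Q-values above appear consistent with a one-step backup over the correct values_k_1, where the intended move succeeds with probability 0.8 and each perpendicular slip has probability 0.1, discounted by 0.9. Those parameters are not stated in this log; they are assumptions inferred from the figures (for example 7.2000 = 0.8 * 0.9 * 10.0000). A minimal sketch of that arithmetic follows; the function name q_value and its arguments are hypothetical and not part of the graded code.

```python
GAMMA = 0.9  # assumed discount, inferred from the log (7.2 = 0.8 * 0.9 * 10)
NOISE = 0.2  # assumed slip probability, split evenly between the two side directions


def q_value(v_intended, v_side_a, v_side_b, reward=0.0, gamma=GAMMA, noise=NOISE):
    """One-step expected return for a movement action.

    v_intended: value of the cell reached when the move succeeds
    v_side_a, v_side_b: values of the cells reached on a perpendicular slip
    (a slip into a wall keeps the agent in place, so pass the current cell's value)
    """
    outcomes = [
        (1.0 - noise, v_intended),
        (noise / 2.0, v_side_a),
        (noise / 2.0, v_side_b),
    ]
    return sum(p * (reward + gamma * v) for p, v in outcomes)


# Intended neighbor holds 10.0000, both side neighbors hold 0.0000.
print("%.4f" % q_value(10.0, 0.0, 0.0))
# Intended neighbor holds 0.0000, side neighbors hold -10.0000 and 1.0000.
print("%.4f" % q_value(0.0, -10.0, 1.0))
```

Under these assumptions the two calls reproduce the 7.2000 and -0.8100 entries of the correct solution. The student grids stay at 0.0000 for every iteration-1 entry except exit, which suggests the stored values are never updated between iterations.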