PPCA-AIPacMan-2024/reinforcement/test_cases/q1/4-discountgrid.test_output
2024-07-06 01:30:00 +08:00

Values at iteration 0 are correct.
Student/correct solution:
values_k_0: """
0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 __________ 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 __________ __________ 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000
"""
Q-Values at iteration 0 for action south are correct.
Student/correct solution:
q_values_k_0_action_south: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Q-Values at iteration 0 for action west are correct.
Student/correct solution:
q_values_k_0_action_west: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Q-Values at iteration 0 for action exit are correct.
Student/correct solution:
q_values_k_0_action_exit: """
-10.0000 illegal 10.0000 illegal illegal
-10.0000 illegal __________ illegal illegal
-10.0000 illegal 1.0000 illegal illegal
-10.0000 illegal __________ __________ illegal
-10.0000 illegal illegal illegal illegal
"""
Q-Values at iteration 0 for action east are correct.
Student/correct solution:
q_values_k_0_action_east: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Q-Values at iteration 0 for action north are correct.
Student/correct solution:
q_values_k_0_action_north: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Values at iteration 1 are NOT correct.
Student solution:
values_k_1: """
0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 __________ 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 __________ __________ 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000
"""
Correct solution:
values_k_1: """
-10.0000 0.0000 10.0000 0.0000 0.0000
-10.0000 0.0000 __________ 0.0000 0.0000
-10.0000 0.0000 1.0000 0.0000 0.0000
-10.0000 0.0000 __________ __________ 0.0000
-10.0000 0.0000 0.0000 0.0000 0.0000
"""
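The correct values at iteration 1 look like the result of taking, for each cell, the maximum Q-value over its legal actions at iteration 0. The sketch below illustrates that reading on a few cells. The cell names and the `backup_values` helper are hypothetical and not part of the autograder; the numbers are copied from the iteration-0 Q-value tables above.

```python
# Iteration-0 Q-values for a few representative cells, copied from the log.
# Cell names are hypothetical labels chosen for this illustration only.
q_k0 = {
    "left_column_cell": {"exit": -10.0},   # only 'exit' is legal here
    "top_goal_cell": {"exit": 10.0},
    "middle_goal_cell": {"exit": 1.0},
    "open_cell": {"north": 0.0, "south": 0.0, "east": 0.0, "west": 0.0},
}


def backup_values(q_table):
    """Return V(s) = max over legal actions a of Q(s, a) for every state."""
    return {state: max(actions.values()) for state, actions in q_table.items()}


v_k1 = backup_values(q_k0)
for state, value in v_k1.items():
    print(f"{state}: {value:.4f}")
```

Under this reading, a student table that is still all zeros at iteration 1 suggests the backup step never ran or its result was not stored, though the log alone does not say which.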
Q-Values at iteration 1 for action south are NOT correct.
Student solution:
q_values_k_1_action_south: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Correct solution:
q_values_k_1_action_south: """
illegal 0.0000 illegal 0.9000 0.0000
illegal -0.9000 __________ 0.0000 0.0000
illegal -0.8100 illegal 0.0900 0.0000
illegal -0.9000 __________ __________ 0.0000
illegal -0.9000 0.0000 0.0000 0.0000
"""
Q-Values at iteration 1 for action west are NOT correct.
Student solution:
q_values_k_1_action_west: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Correct solution:
q_values_k_1_action_west: """
illegal -7.2000 illegal 7.2000 0.0000
illegal -7.2000 __________ 0.0000 0.0000
illegal -7.2000 illegal 0.7200 0.0000
illegal -7.2000 __________ __________ 0.0000
illegal -7.2000 0.0000 0.0000 0.0000
"""
Q-Values at iteration 1 for action exit are correct.
Student/correct solution:
q_values_k_1_action_exit: """
-10.0000 illegal 10.0000 illegal illegal
-10.0000 illegal __________ illegal illegal
-10.0000 illegal 1.0000 illegal illegal
-10.0000 illegal __________ __________ illegal
-10.0000 illegal illegal illegal illegal
"""
Q-Values at iteration 1 for action east are NOT correct.
Student solution:
q_values_k_1_action_east: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Correct solution:
q_values_k_1_action_east: """
illegal 7.2000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.7200 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Q-Values at iteration 1 for action north are NOT correct.
Student solution:
q_values_k_1_action_north: """
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ 0.0000 0.0000
illegal 0.0000 illegal 0.0000 0.0000
illegal 0.0000 __________ __________ 0.0000
illegal 0.0000 0.0000 0.0000 0.0000
"""
Correct solution:
q_values_k_1_action_north: """
illegal 0.0000 illegal 0.9000 0.0000
illegal -0.9000 __________ 0.0000 0.0000
illegal -0.8100 illegal 0.0900 0.0000
illegal -0.9000 __________ __________ 0.0000
illegal -0.9000 0.0000 0.0000 0.0000
"""
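The correct iteration-1 Q-values are consistent with a discount of 0.9, a noise of 0.2 split evenly between the two perpendicular directions, and a living reward of 0. These parameters are inferred from the numbers in this log, not stated in it, so treat them as assumptions. A minimal sketch of the one-step computation, using a hypothetical `q_value` helper:

```python
# Assumed parameters, inferred from the expected values in the log.
DISCOUNT = 0.9
NOISE = 0.2


def q_value(v_intended, v_left, v_right,
            discount=DISCOUNT, noise=NOISE, living_reward=0.0):
    """One-step expected return for a noisy grid move.

    v_intended: current value of the cell reached when the move succeeds.
    v_left, v_right: current values of the cells reached when the agent
    slips to either perpendicular side (each with probability noise / 2).
    """
    slip = noise / 2
    outcomes = [(1 - noise, v_intended), (slip, v_left), (slip, v_right)]
    return sum(p * (living_reward + discount * v) for p, v in outcomes)


# Moving west into a -10 cell, with both slip targets valued 0.
q_west = q_value(v_intended=-10.0, v_left=0.0, v_right=0.0)

# Moving north from the cell between a -10 cell (west) and the 1.0 cell (east).
q_north = q_value(v_intended=0.0, v_left=-10.0, v_right=1.0)

# Moving east into the 10.0 cell.
q_east = q_value(v_intended=10.0, v_left=0.0, v_right=0.0)

print(f"{q_west:.4f} {q_north:.4f} {q_east:.4f}")
```

These three cases correspond to the -7.2000, -0.8100, and 7.2000 entries in the correct west, north, and east tables above.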