-# "B", "C"). They keys can only be used in sequence: initially the
-# agent can move only to free spaces, or to the "a", in which case it
-# now carries it, and can move to free spaces or the "A". When it
-# moves to the "A", it gets a reward and loses the "a", but can now
-# move to the "b", etc. Rewards are 1 for "A" and "B" and 10 for "C".
+# "B", "C"). The keys and vault can only be used in sequence:
+# initially the agent can move only to free spaces, or to the "a", in
+# which case the key is removed from the environment and the agent now
+# carries it, and can move to free spaces or the "A". When it moves to
+# the "A", it gets a reward, loses the "a", the "A" is removed from
+# the environment, but can now move to the "b", etc. Rewards are 1 for
+# "A" and "B" and 10 for "C".