README.txt

[This file may describe an older version than the current code]

Trying to make GPTs build their own "culture".

Francois Fleuret
Jun 21st, 2024

* Motivation

The original motivation of this experiment is the hypothesis that
high-level cognition emerges from the competition among humans in the
space of language and ideas.

More precisely, communicating agents try to outdo their competitors by
creating something that is smart but doable, i.e. something that some
of the other agents get, but not all. That smart thing is then added
to the "culture", all the agents learn and get to understand it, and
the process repeats.

* Setup

The GPTs start with a "world model" that they acquire before they
communicate, and from there, they try to "be smart" by proposing
quizzes that can be solved, but not by everybody.

There are 5 competing GPTs.

The "world" is a 6x8 grid with three "birds", each moving in a straight
line and bouncing off the world's borders. It could be another
"world", but this one has objectness and motion. There are ten colors
and four directions of motion, so roughly (6x8x4x10)**3 ~ 7e9 states.
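As a quick check of that estimate, the count can be reproduced in a few lines. This is only a sketch of the arithmetic above: the constant names are made up for illustration, and the count treats the three birds as independent and ordered, so it ignores the no-occlusion constraint.

```python
# Rough count of world states: each bird has a cell, a direction and a color.
# All names below are illustrative, not taken from the actual code.
HEIGHT, WIDTH = 6, 8
NB_DIRECTIONS = 4
NB_COLORS = 10
NB_BIRDS = 3

states_per_bird = HEIGHT * WIDTH * NB_DIRECTIONS * NB_COLORS
nb_states = states_per_bird ** NB_BIRDS

print(states_per_bird)     # 1920
print(f"{nb_states:.1e}")  # 7.1e+09
```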

A "quiz" is built from a random world state and the state after two
iterations of the birds moving. The task is to predict the second
frame given the first, or the first given the second. The starting
and ending states are chosen, by rejection, so that there is no
occlusion.
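The quiz generation described above could be sketched as follows. This is a minimal illustration and not the actual implementation. It assumes that each bird occupies a single cell, that the four directions are axis-aligned, and that "occlusion" means two birds sharing a cell. All function names are hypothetical, and the encoding of states into frames is left out.

```python
import random

HEIGHT, WIDTH = 6, 8
NB_BIRDS, NB_COLORS, NB_STEPS = 3, 10, 2
# Assumption: the four directions are axis-aligned unit steps,
# ordered so that flipping the lowest bit reverses a direction.
DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def random_state(rng):
    """Draw NB_BIRDS birds as (row, col, direction index, color) tuples."""
    return [
        (rng.randrange(HEIGHT), rng.randrange(WIDTH),
         rng.randrange(len(DIRECTIONS)), rng.randrange(1, NB_COLORS + 1))
        for _ in range(NB_BIRDS)
    ]


def step(state):
    """Move every bird by one cell, reversing its direction at a border."""
    new_state = []
    for i, j, d, c in state:
        di, dj = DIRECTIONS[d]
        if not (0 <= i + di < HEIGHT and 0 <= j + dj < WIDTH):
            d ^= 1  # up <-> down, left <-> right
            di, dj = DIRECTIONS[d]
        new_state.append((i + di, j + dj, d, c))
    return new_state


def has_occlusion(state):
    """Assumption: occlusion means two birds on the same cell."""
    cells = [(i, j) for i, j, _, _ in state]
    return len(set(cells)) < len(cells)


def generate_quiz(rng):
    """Rejection-sample a (start, end) pair without occlusion in either."""
    while True:
        start = random_state(rng)
        end = start
        for _ in range(NB_STEPS):
            end = step(end)
        if not has_occlusion(start) and not has_occlusion(end):
            return start, end


rng = random.Random(0)
start, end = generate_quiz(rng)
```

The forward quiz is then "given start, predict end", and the backward quiz is "given end, predict start".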

My home-baked GPT-37M trained with 250k solves this with ~99% success
[to be verified with the new setup].

At every iteration, we select the GPT with the lowest test accuracy,
and train it for one epoch.

* Creating new quizzes

If the test accuracy of the selected GPT is then higher than 97.5%, it
creates new quizzes. To do so, it generates a large number of pairs of
frames, and checks which of these quizzes are hard but not too hard,
which means [THIS IS THE IMPORTANT BIT]:

  it can be solved, in both time directions, by all the other GPTs
  **but one**

Requiring both time directions avoids a simple type of quiz that
merely amounts to dealing with noise in the first frame.
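The acceptance criterion can be written down as a small predicate. This is a sketch under one reading of the rule: a model counts as solving a quiz only if it succeeds in both time directions, and exactly one of the other models must fail. The names is_valid_quiz and solves are hypothetical, and the outcomes table is made-up toy data.

```python
def is_valid_quiz(quiz, other_models, solves):
    """Keep a quiz if all the other models but one solve it both ways."""
    nb_solving = sum(
        1 for m in other_models
        if solves(m, quiz, "forward") and solves(m, quiz, "backward")
    )
    return nb_solving == len(other_models) - 1


# Toy example with made-up outcomes: (forward ok, backward ok) per model.
outcomes = {
    "gpt_1": (True, True),
    "gpt_2": (True, True),
    "gpt_3": (True, False),  # fails one direction, so it does not count
    "gpt_4": (True, True),
}


def solves(model, quiz, direction):
    forward_ok, backward_ok = outcomes[model]
    return forward_ok if direction == "forward" else backward_ok


print(is_valid_quiz("some quiz", list(outcomes), solves))  # True
```

A quiz that every other model solves is rejected as too easy, and one that two or more of them fail is rejected as too hard.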

The GPT generates 1000 such quizzes, which are added to the
"culture", i.e. the training set.

We update the test accuracy of all the GPTs, and then we go to the
next iteration.
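Putting the steps together, one iteration of the procedure might look like the sketch below. StubGPT and its methods are hypothetical placeholders that only simulate training and evaluation; they are not the real models. The sketch also assumes that the test accuracies are refreshed only when new quizzes have been added.

```python
ACCURACY_THRESHOLD = 0.975
NB_NEW_QUIZZES = 1000


class StubGPT:
    """Hypothetical stand-in for a real model, with simulated accuracy."""

    def __init__(self, name, test_accuracy):
        self.name = name
        self.test_accuracy = test_accuracy

    def train_one_epoch(self, culture):
        # Placeholder for real training: pretend accuracy improves a bit.
        self.test_accuracy = min(1.0, self.test_accuracy + 0.01)

    def create_validated_quizzes(self, others, nb):
        # Placeholder for generation plus the "all others but one" filter.
        return [f"quiz #{k} by {self.name}" for k in range(nb)]

    def evaluate(self):
        # Placeholder: a real version would re-measure the test accuracy.
        pass


def one_iteration(models, culture):
    # Select the GPT with the lowest test accuracy and train it.
    weakest = min(models, key=lambda m: m.test_accuracy)
    weakest.train_one_epoch(culture)
    # Above the threshold, it contributes new quizzes to the culture.
    if weakest.test_accuracy > ACCURACY_THRESHOLD:
        others = [m for m in models if m is not weakest]
        culture.extend(weakest.create_validated_quizzes(others, NB_NEW_QUIZZES))
        for m in models:
            m.evaluate()
    return weakest


models = [StubGPT(f"gpt_{k}", acc)
          for k, acc in enumerate([0.97, 0.98, 0.99, 0.99, 0.99])]
culture = []
trained = one_iteration(models, culture)
print(trained.name, len(culture))  # gpt_0 1000
```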

The hope is that interesting concepts emerge (connectivity, symmetry,
interior/exterior, shape vocabulary, etc.).