1 %% -*- mode: latex; mode: reftex; mode: flyspell; coding: utf-8; tex-command: "pdflatex.sh" -*-
3 %% Any copyright is dedicated to the Public Domain.
4 %% https://creativecommons.org/publicdomain/zero/1.0/
5 %% Written by Francois Fleuret <francois@fleuret.org>
7 \documentclass[11pt,a4paper,oneside]{article}
8 \usepackage[paperheight=15cm,paperwidth=8cm,top=2mm,bottom=15mm,right=5mm,left=5mm]{geometry}
9 %\usepackage[a4paper,top=2.5cm,bottom=2cm,left=2.5cm,right=2.5cm]{geometry}
10 \usepackage[utf8]{inputenc}
11 \usepackage{amsmath,amssymb,dsfont}
12 \usepackage[pdftex]{graphicx}
13 \usepackage[colorlinks=true,linkcolor=blue,urlcolor=blue,citecolor=blue]{hyperref}
16 \usetikzlibrary{arrows,arrows.meta,calc}
17 \usetikzlibrary{patterns,backgrounds}
18 \usetikzlibrary{positioning,fit}
19 \usetikzlibrary{shapes.geometric,shapes.multipart}
20 \usetikzlibrary{patterns.meta,decorations.pathreplacing,calligraphy}
21 \usetikzlibrary{tikzmark}
22 \usetikzlibrary{decorations.pathmorphing}
23 \usepackage[round]{natbib}
24 \usepackage[osf]{libertine}
25 \usepackage{microtype}
27 \usepackage{mleftright}
30 \setlist[itemize]{leftmargin=0pt,itemindent=1em,itemsep=2ex}
31 \setlist{nosep} % or \setlist{noitemsep} to leave space around whole list
33 \newcommand{\setmuskip}[2]{#1=#2\relax}
34 \setmuskip{\thinmuskip}{1.5mu} % by default it is equal to 3 mu
35 \setmuskip{\medmuskip}{2mu} % by default it is equal to 4 mu
36 \setmuskip{\thickmuskip}{3.5mu} % by default it is equal to 5 mu
38 \setlength{\parindent}{0cm}
39 \setlength{\parskip}{1ex}
40 %\renewcommand{\baselinestretch}{1.3}
41 %\setlength{\tabcolsep}{0pt}
42 %\renewcommand{\arraystretch}{1.0}
44 \def\argmax{\operatornamewithlimits{argmax}}
45 \def\argmin{\operatornamewithlimits{argmin}}
47 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
49 \def\given{\,\middle\vert\,}
50 \def\proba{\operatorname{P}}
51 \newcommand{\seq}{{S}}
52 \newcommand{\expect}{\mathds{E}}
53 \newcommand{\variance}{\mathds{V}}
54 \newcommand{\empexpect}{\hat{\mathds{E}}}
55 \newcommand{\mutinf}{\mathds{I}}
56 \newcommand{\empmutinf}{\hat{\mathds{I}}}
57 \newcommand{\entropy}{\mathds{H}}
58 \newcommand{\empentropy}{\hat{\mathds{H}}}
59 \newcommand{\ganG}{\mathbf{G}}
60 \newcommand{\ganD}{\mathbf{D}}
61 \newcommand{\ganF}{\mathbf{F}}
63 \newcommand{\dkl}{\mathds{D}_{\mathsf{KL}}}
64 \newcommand{\djs}{\mathds{D}_{\mathsf{JS}}}
66 \newcommand*{\vertbar}{\rule[-1ex]{0.5pt}{2.5ex}}
67 \newcommand*{\horzbar}{\rule[.5ex]{2.5ex}{0.5pt}}
69 \def\positionalencoding{\operatorname{pos-enc}}
70 \def\concat{\operatorname{concat}}
71 \def\crossentropy{\LL_{\operatorname{ce}}}
73 \newcommand{\separator}{\begin{center}
77 \newcommand{\pic}[2]{%
80 \includegraphics[scale=0.25]{#1}
82 \hspace*{\stretch{1}}%
85 \newcommand{\birdpic}[2]{%
88 \includegraphics[scale=0.35]{#1}
90 \hspace*{\stretch{1}}%
93 \newenvironment{example}{%
97 \begin{minipage}{\textwidth}
99 \setlength{\parindent}{0cm}
100 \setlength{\parskip}{1ex}
111 {\Large Self-Generated Culture}
119 \centerline{\color{red}(work in progress, to be updated)}
123 \centerline{\url{https://fleuret.org/public/culture/culture.pdf}}
127 \section{Introduction}
129 The hypothesis behind this experiment is that high-level abstract
130 thinking is fueled by social competition.
132 A group of communicating agents that try to demonstrate their
133 cognitive superiority would end up developing a rich and consistent
The experiment is designed with a group of GPTs that alternately
learn to solve quizzes and generate new ones.
141 A ``quiz'' is a pair composed of a prompt and a solution, both being
144 We differentiate \textbf{world quizzes} that follow pre-defined and
145 fixed regularities, and mimic the world's physical and environmental
146 patterns that an organism has to grasp to survive, and \textbf{culture
147 quizzes} that are generated by the GPTs, and mimic the knowledge one
148 has to master to perform socially.
We train five GPTs on a very large set of ``world quizzes''
152 generated randomly. These models are trained to generate both the
153 solution given the prompt, and the prompt given the solution.
This is achieved by training on both ``forward sequences'',
composed of a token \texttt{[fwd]}, followed by the prompt's tokens,
followed by another token \texttt{[fwd]}, followed by the solution's
tokens, and ``backward sequences'', composed of a token \texttt{[bck]},
followed by the solution's tokens, followed by another token
\texttt{[bck]}, followed by the prompt's tokens.
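The two sequence layouts can be sketched as follows. This is a minimal illustration and not the author's implementation: the function names \texttt{forward\_sequence} and \texttt{backward\_sequence}, and the use of strings as tokens, are assumptions made for readability.

```python
FWD, BCK = "[fwd]", "[bck]"  # marker tokens named in the text


def forward_sequence(prompt, solution):
    """[fwd] prompt [fwd] solution: the solution follows the prompt."""
    return [FWD, *prompt, FWD, *solution]


def backward_sequence(prompt, solution):
    """[bck] solution [bck] prompt: the prompt follows the solution."""
    return [BCK, *solution, BCK, *prompt]


# Toy quiz with made-up token names (hypothetical, for illustration only).
prompt, solution = ["a", "b", "c"], ["x", "y"]
print(forward_sequence(prompt, solution))
print(backward_sequence(prompt, solution))
```

Since both layouts share one vocabulary, a single autoregressive model can be trained on a mixture of the two, the leading marker indicating which part has to be generated.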
162 \subsection{Generating Culture Quizzes}
When their accuracy gets above $95\%$, we generate new quizzes as follows:
168 \item generate a solution (without conditioning) at temperature $T=2$,
169 then generate a prompt for that solution at temperature $T=1/2$, and
170 then generate a solution for that prompt at temperature $T=1/2$.
\item generate one solution for that prompt with each of the $5$ GPTs
at temperature $T=1$; if $4$ of them generate the correct solution,
validate that quiz and include it in the training data.
This criterion ensures that the new quizzes are both solvable and
sophisticated, and incrementally complexify the culture. Imposing both
directions prevents the generation of quizzes which are not trivial
only because the prompt has been randomly degraded.
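The validation step can be sketched as below. This is a hedged illustration: the name \texttt{validate\_quiz}, the representation of each GPT as a callable, and the parameter \texttt{nb\_allowed\_failures} are hypothetical, and reading ``$4$ of them'' as \emph{exactly} four out of five is an assumption.

```python
def validate_quiz(prompt, solution, solvers, nb_allowed_failures=1):
    """Return True if all but `nb_allowed_failures` solvers recover `solution`.

    `solvers` is a list of callables mapping a prompt to a predicted
    solution; each one stands in for a GPT sampled at temperature T=1
    (hypothetical interface, not the author's API).
    """
    nb_correct = sum(solver(prompt) == solution for solver in solvers)
    # Assumption: "4 of them" is read as exactly 4 out of 5.
    return nb_correct == len(solvers) - nb_allowed_failures


def good_solver(prompt):
    """Toy solver that always finds the expected answer (a reversal)."""
    return prompt[::-1]


def bad_solver(prompt):
    """Toy solver that always fails."""
    return []


quiz_prompt, quiz_solution = [1, 2, 3], [3, 2, 1]
print(validate_quiz(quiz_prompt, quiz_solution, [good_solver] * 4 + [bad_solver]))
print(validate_quiz(quiz_prompt, quiz_solution, [good_solver] * 5))
```

Under this reading, a quiz that every model solves is rejected as too easy, and a quiz that fewer than four models solve is rejected as unreliable.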
183 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
184 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
185 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
186 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
190 \section{Grid Quizzes}
192 \subsection{World Quizzes}
We define several types of quizzes and implement algorithmic
procedures to generate randomly as many examples from each as we
198 In these quizzes, the prompt is made of three grids $A, f(A), B$ and
199 the solution is a single grid $f(B)$.
201 \subsubsection{Half Fill}
203 \pic{pics/task_color_grow.png}{``half fill''}
The first grid contains three rectangles, each with a vertical or a
horizontal line of another color in its middle. The second grid is
identical, except that one of the rectangles has one half filled. The third
grid contains three rectangles of the same colors as in the first grid,
but of different sizes and locations. The solution is obtained by
similarly filling one half of a rectangle of the third grid.
212 \subsubsection{Detect}
214 \pic{pics/task_detect.png}{``detect''}
The first grid contains three rectangles, the second has two pixels of
the same colors located at the top-left corners of two of them. The
solution is obtained by marking in the fourth grid the top-left
corners of the rectangles of the same colors in the third.
221 \subsubsection{Frame}
223 \pic{pics/task_frame.png}{``frame''}
225 The first grid contains three rectangles, and the second is identical
226 except that one rectangle has been replaced by its frame. The same
227 should be done to the similarly colored rectangles of the third grid
228 to obtain the solution.
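As an illustration of the kind of transformation $f$ involved, the ``frame'' operation can be sketched on a grid stored as a list of rows. The function name \texttt{frame\_rectangle}, the integer color codes, and the background value $0$ are assumptions, and this is not the author's generator.

```python
def frame_rectangle(grid, top, left, height, width, background=0):
    """Return a copy of `grid` where the given rectangle keeps only its border."""
    out = [row[:] for row in grid]
    # Erase the interior cells, leaving a one-pixel-thick frame.
    for i in range(top + 1, top + height - 1):
        for j in range(left + 1, left + width - 1):
            out[i][j] = background
    return out


# A 5x6 grid with one 4x4 rectangle of color 3 (arbitrary integer code).
grid = [[0] * 6 for _ in range(5)]
for i in range(4):
    for j in range(1, 5):
        grid[i][j] = 3

framed = frame_rectangle(grid, top=0, left=1, height=4, width=4)
for row in framed:
    print(row)
```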
232 \pic{pics/task_grow.png}{``grow''}
234 The first grid contains three rectangles, one of them getting one
235 pixel thicker or thinner in the second. The same should be done to the
236 similarly colored rectangles of the third grid to get the solution.
238 \subsubsection{Replace color}
240 \pic{pics/task_replace_color.png}{``replace color''}
242 The first grid contains three rectangles, the second is obtained by
243 changing one of the colors. The same should be done to the third grid
244 to obtain the solution.
246 \subsubsection{Translate}
248 \pic{pics/task_translate.png}{``translate''}
The first grid contains three rectangles. The second is obtained by
displacing one of them by one pixel in both directions. The solution is
obtained by applying the same motion to the similarly colored
rectangle in the third grid.
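The ``translate'' motion can be sketched in the same spirit. The function name \texttt{translate\_color}, the integer color codes, and the assumption that the moved rectangle stays inside the grid are all hypothetical simplifications.

```python
def translate_color(grid, color, di=1, dj=1, background=0):
    """Move every cell of `color` by (di, dj).

    Assumes the displaced cells remain inside the grid; a moved cell
    overwrites whatever it lands on.
    """
    # Start from a copy where the moving rectangle has been erased.
    out = [[background if c == color else c for c in row] for row in grid]
    for i, row in enumerate(grid):
        for j, c in enumerate(row):
            if c == color:
                out[i + di][j + dj] = color
    return out


# A 4x5 grid with a 2x2 rectangle of color 2 in the top-left corner.
grid = [
    [2, 2, 0, 0, 0],
    [2, 2, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
moved = translate_color(grid, color=2)
for row in moved:
    print(row)
```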
255 %% \subsubsection{Bounce}
257 %% \pic{pics/task_bounce.png}{``bounce''}
259 %% The solution should join the two pixels of same color, with a path of
260 %% another color, starting in the direction indicated by a pixel of that
261 %% color, and changing direction only when colliding with a pixel of a
262 %% third color or one of the lattice border.
264 %% \subsubsection{count}
266 %% \pic{pics/task_count.png}{``count''}
268 %% \subsubsection{scale}
270 %% \pic{pics/task_scale.png}{``scale''}
272 %% \subsubsection{trajectory}
274 %% \pic{pics/task_trajectory.png}{``trajectory''}
276 \subsection{Culture Quizzes}
278 We list here some generated quizzes that exhibit features that were not present in the ``world quizzes'' used for training.
284 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_01.png}{0078/01}
286 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_02.png}{0078/02}
296 \pic{pics/culture_c_quiz_0110_N4_validated/quiz_63.png}{0110/63}
298 The quizzes ``frame'' and ``half fill'' have been combined in a single
307 \pic{pics/culture_c_quiz_0087_N4_validated/quiz_62.png}{0087/62}
309 \pic{pics/culture_c_quiz_0102_N4_validated/quiz_04.png}{0102/04}
311 \pic{pics/culture_c_quiz_0102_N4_validated/quiz_11.png}{0102/11}
313 \pic{pics/culture_c_quiz_0108_N4_validated/quiz_31.png}{0108/31}
315 Variation of ``Detect'' with location markers colored according to the
316 color of the rectangle they mark.
324 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_16.png}{0078/16}
326 \pic{pics/culture_c_quiz_0084_N4_validated/quiz_21.png}{0084/21}
328 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_42.png}{0078/42}
330 \pic{pics/culture_c_quiz_0089_N4_validated/quiz_28.png}{0089/28}
332 \pic{pics/culture_c_quiz_0084_N4_validated/quiz_00.png}{0084/00}
334 Variations of ``Half Fill'', ``Detect'', ``Translate'', ``Grow'', and
335 ``Frame'' with a number of rectangles not equal to three.
343 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_27.png}{0078/27}
345 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_18.png}{0078/18}
347 \pic{pics/culture_c_quiz_0086_N4_validated/quiz_45.png}{0086/45}
349 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_37.png}{0078/37}
351 Variations of ``Half Fill'' where the shapes to change have more
360 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_30.png}{0078/30}
362 Variation of ``Translate'' where the moving part is occluded, which
371 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_31.png}{0078/31}
373 \pic{pics/culture_c_quiz_0084_N4_validated/quiz_10.png}{0084/10}
375 \pic{pics/culture_c_quiz_0084_N4_validated/quiz_12.png}{0084/12}
377 \pic{pics/culture_c_quiz_0086_N4_validated/quiz_23.png}{0086/23}
379 \pic{pics/culture_c_quiz_0086_N4_validated/quiz_28.png}{0086/28}
381 Variations of ``Half Fill'' with non-rectangular shapes.
389 \pic{pics/culture_c_quiz_0078_N4_validated/quiz_60.png}{0078/60}
391 \pic{pics/culture_c_quiz_0084_N4_validated/quiz_41.png}{0084/41}
393 \pic{pics/culture_c_quiz_0084_N4_validated/quiz_49.png}{0084/49}
395 \pic{pics/culture_c_quiz_0086_N4_validated/quiz_04.png}{0086/04}
397 Variations of ``Half Fill'' with two colors or two rectangles have to
406 \pic{pics/culture_c_quiz_0111_N4_validated/quiz_23.png}{0111/23}
408 Variation of ``Frame'' with no rectangle of adequate size to be
413 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
414 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
415 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
416 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
422 These results were obtained with a slightly different procedure. In
423 particular the quizzes were validated if the models could predict both
424 the solution from the prompt and the prompt from the solution. We
425 report them since they exhibit the same patterns of generalization
426 although they are quite different.
428 \subsection{World Quizzes}
The initial set of quizzes consists of predicting the dynamics of a
very simple world: a $6 \times 8$ grid with three colored ``birds'' moving in
a straight line, possibly bouncing off the grid's borders. There are
ten different colors.
435 \birdpic{pics/examples_train.png}{}
In each of these quizzes, $A$ is the left image serialized in
439 raster-scan order as a sequence of $6 \times 8 = 48$ tokens, $d$ is
440 either the token ``forward'' or the token ``backward'', and $B$ is the
441 right image, also serialized. The direction of prediction is chosen at
\subsection{Culture Quizzes}
446 This procedure results in the discovery of patterns which are not
447 present in the original quizzes:
451 \birdpic{pics/4_birds_1.png}{}
453 \birdpic{pics/5_birds_1.png}{}
455 \birdpic{pics/6_birds_1.png}{}
465 \birdpic{pics/other_shapes_2.png}{}
467 \birdpic{pics/other_shapes_3.png}{}
477 \birdpic{pics/other_shapes_1.png}{}
479 \birdpic{pics/occlusions_1.png}{}
485 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
486 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
487 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
488 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
492 \section{Various thoughts}
496 \item The whole process can be envisioned as natural selection of
497 quizzes in the representation landscape of GPTs. There probably is a
498 subtle relation between the temperature (mutation rate) and the
499 number of models used to validate with the ``all but one'' criterion
500 (survival criterion).
\item The ``all but one'' criterion could be ``all but $K$'', and there may be
an information-theoretic interpretation, where the goal is to maximize
mutual information, with $K=N$ being total randomness, so high
entropy but no structure, and $K=0$ being total determinism, so no
information to share.
\item The setup does not push toward any specific invariance or
property in the generated quizzes; their consistency is entirely due
to the statistics of the ``world quizzes'' that remain in the
training set, and to the GPTs' inductive biases.
\item The GPTs obviously get a sense of objectness and 2d topology
early on, since they rapidly increase the number of birds and
``discover'' occlusion even though it never was in the world
518 \item There may not be so many problems that can be cast as pairs of
519 patterns that are each a deterministic function of the other, which
520 is probably critical here.
\item This overall process probably fights the ``simplicity bias'': If
523 a model is lacking a ``cue'' that the others have, there will
524 rapidly be quizzes that require this cue, they will be added to the
525 training data, and that model will catch up.
\item The randomness of the process probably allows going even beyond
just synchronizing the abilities of the models. There may be some
additional complexification of quizzes that get accepted by chance.
\item It can be parallelized by dispatching the GPTs across multiple
nodes, and avoiding a quadratic cost by limiting the validation of
the quizzes to a subset of them.
\item The current process to generate new quizzes, which simply
samples them at random, is very rudimentary and probably not
sufficient in a real-data setup. It can probably be supplemented
with an MCTS-type search.
540 \item There may be already in the generated quizzes some structure
541 that \emph{we} do not pick up (e.g. certain color or motion
548 The code is available at
552 \centerline{\url{https://fleuret.org/git/culture}}
The experiments are done with an RTX 4090.
556 The GPT used has 37M parameters and the following structure:
560 \texttt{dim\_model} & 512 \\
561 \texttt{dim\_keys} & 64 \\
562 \texttt{dim\_hidden} & 2048 \\
563 \texttt{nb\_heads} & 8 \\
564 \texttt{nb\_blocks} & 12
Adam, $\eta = 10^{-4}$, no scheduling.
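As a rough consistency check of the parameter count, and assuming a standard transformer block with four attention projections of size $\texttt{dim\_model} \times (\texttt{nb\_heads} \times \texttt{dim\_keys})$ and a two-layer MLP (biases, layer normalizations and embeddings are ignored, which is an approximation), the dimensions listed above give
\[
12 \times \left( 4 \times 512 \times (8 \times 64) + 2 \times 512 \times 2048 \right) = 37\,748\,736 \approx 37.7\,\mathrm{M},
\]
which is consistent with the stated 37M.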
570 There are $N_{\text{train}}=250'000$ original quizzes for training and
571 $N_{\text{test}} = 10'000$ for test.
573 At each epoch, for both train and test samples, we mix original
574 quizzes and the generated ones.
For training for instance, if there are fewer than $N_{\text{train}}/2$
577 new quizzes, we take all of them, otherwise we sample
578 $N_{\text{train}}/2$ of them without replacement, and then we sample
579 without replacement enough original quizzes to get $N_{\text{train}}$
582 We proceed similarly to get $N_{\text{test}}$ samples for test.
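The mixing procedure described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name \texttt{mix\_quizzes} and the representation of quizzes as list elements are hypothetical, and the actual code may differ.

```python
import random


def mix_quizzes(original, generated, nb_samples, rng=random):
    """Build one epoch's sample set from original and generated quizzes.

    At most nb_samples // 2 generated quizzes are used (all of them if
    there are fewer), and the rest is filled with original quizzes.
    Both draws are made without replacement.
    """
    nb_generated = min(len(generated), nb_samples // 2)
    chosen = rng.sample(generated, nb_generated)
    chosen += rng.sample(original, nb_samples - nb_generated)
    return chosen


# Toy data: 20 original quizzes and 3 generated ones, 10 samples per epoch.
original = [("world", i) for i in range(20)]
generated = [("culture", i) for i in range(3)]
batch = mix_quizzes(original, generated, nb_samples=10)
print(len(batch), sum(kind == "culture" for kind, _ in batch))
```

With only three generated quizzes available, all three are kept and seven original quizzes complete the set. Once more than \texttt{nb\_samples // 2} generated quizzes exist, the split becomes half and half.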